Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot "unregister" a function, collation or aggregate in SQLite3 extension #10726

Open
bohwaz opened this issue Feb 27, 2023 · 13 comments
Open

Comments

@bohwaz
Copy link
Contributor

bohwaz commented Feb 27, 2023

Description

The following code:

<?php
$db = new SQLite3(':memory:');
$db->createFunction('like', 'strcmp');
$db->createFunction('like', null);

Resulted in this output:

Argument #2 ($callback) must be of type callable, null given

But I expected to be able to unregister the custom like function, as documented in SQLite doc:

To delete an existing SQL function or aggregate, pass NULL pointers for all three function callbacks.

https://www.sqlite.org/c3ref/create_function.html

Passing NULL to SQLite3::create* methods should un-register the existing function.

PHP Version

Any

Operating System

No response

@nielsdos
Copy link
Member

Yeah it looks indeed like the PHP side for createFunction does not handle the NULL case at all, i.e. it's a little more work (but not too bad I think) than just adapting the parameter parsing code.
Tbh I'd find it cleaner if we have a deleteFunction function, but it's probably best to mirror the sqlite3 api.
I can try to take a look at this soon-ish.

@Danack
Copy link
Contributor

Danack commented Mar 1, 2023

@nielsdos if you fix it great, if not I might do it as part of the 'PDO subclasses RFC' which I'm meant to be working on.

@bohwaz
Copy link
Contributor Author

bohwaz commented Mar 1, 2023

That would be great :)

I don't have time to make a PR for this currently, but our non-profit can contribute a (small) bounty towards the fix if someone has time :)

And yes, I would also prefer having deleteFunction, but:

  • that means having 3 new methods
  • setAuthorizer already uses NULL to revert to default authorizer
  • SQLite lib uses NULL and documents it, so someone reading the SQLite doc will probably try (like me) to use NULL as well

I had a look at the code, we should retrieve the function from the collation/function/aggregate dictionary, and un-register it.

Please note that to "unregister" a function, we have to use the same encoding and number of arguments when calling sqlite3_create_function_v2, as noted here: https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg100765.html

@nielsdos
Copy link
Member

nielsdos commented Mar 1, 2023

@Danack Ah, I already had started on implementing this and I think I'll continue working on this. Although my time for today is up so the PR will be for tomorrow :)
@bohwaz Thanks for the pointers, I'll adapt the creation functions.

@nielsdos
Copy link
Member

nielsdos commented Mar 2, 2023

I ran into a bit of a roadblock. First of all, changing the signature of those 3 functions from callable to ?callable is actually a BC break because the two classes (SQLite3 & PDO) are extendible by user-defined classes. So this is master material and probably requires an RFC.

Second, using null to unregister creates some really awkward APIs:

public SQLite3::createAggregate(
    string $name,
    ?callable $stepCallback,
    ?callable $finalCallback,
    int $argCount = -1
): bool

In this case $stepCallback and $finalCallback must either both be non-null or both be null.

Since createAggregate and createFunction both use the sqlite3_create_function API behind the scenes, the introduction of deleting options for both create functions is furthermore duplication of functionality. The createCollation uses a different function behind the scenes.
Since we have a linked list of registered functions anyway, we can just store which kind of function it was and have one delete function instead of adapting the create functions (or introducing 3 delete functions). This would prevent BC breaks and would be a cleaner API imo.

@bohwaz
Copy link
Contributor Author

bohwaz commented Mar 4, 2023

Yes @nielsdos I agree, for createFunction using NULL is straightforward, but createAggregate would be weird. Also the fact that you have to supply the same argCount could lead to weird behaviour.

Creating deleteFunction/deleteAggregate/deleteCollation methods seems the right path then.

It wouldn't be too off with setAuthorizer accepting NULL, as it's not called "createAuthorizer", so setAuthorizer(null) makes more sense.

As for making the PHP API similar to the SQLite C API, I believe that having delete* methods in the documentation should be quite clear :)

All good for me to go with delete* :)

@nielsdos
Copy link
Member

nielsdos commented Mar 4, 2023

We can still choose between 3 delete functions versus 1 delete function as PHP can internally bookkeep what kind of function it is. Since @Danack is already doing an RFC it maybe makes more sense that he incorporates this into his work instead of me creating a separate RFC, if Danack's okay with that :)

@bohwaz
Copy link
Contributor Author

bohwaz commented Mar 4, 2023

AFAIK the work @Danack is doing is only on PDO, not on the SQLite3 extension, so it would still be nice to implement it in SQLite3 :)

Because driver-specific methods are forbidden in PDO currently ( sad ), I implemented missing SQLite features only in SQLite3 for now. If at some point we can get all missing features in PDO, and deprecate the SQLite3 extension, it would be great as we wouldn't have to maintain both, and as a PHP developer we wouldn't have to make the tough choice between PDO and SQLite3 depending on which features we need. But that's another topic. Hopefully @Danack RFC will pass and we will be able to catch up in pdo-sqlite.

@Danack
Copy link
Contributor

Danack commented May 6, 2023

Okay, so.

I had a look at implementing these and while they're easy to implement, I really don't like the current API for either PDO_SQLITE or Sqlite3 for creating these callbacks. Possibly that's because I don't understand them, but my concerns are:

Charset not exposed.

The charset that they functions work on isn't exposed. The constants SQLITE_UTF8, SQLITE_UTF16LE, SQLITE_UTF16BE and SQLITE_UTF16 should probably be exposed and be passable as a parameter for all of the callbacks, and SQLITE_UTF16_ALIGNED for collations.

Having to pass argc is weird and annoying

The delete functionality depends on having the name, argc and charset of the function to delete be the same as that of the created callback. If you call the delete function with the same name, but a different argc, it returns SQLITE_OK but doesn't actually delete anything.

Having to 'remember' what you passed in as argc for a callback to be able to delete it sounds like an "ungood" API choice.

It seems there are two cases:

  • the callback is non-variadic, and we should be able to get the argc from the callback.
  • the callback is variadic, and the argc should be -1, probably.

I'm really tempted to split adding these methods to the PDOSqlite class into a separate RFC, so that the API can be thought about by themselves, rather than being lost in the inevitable opinions people are going to have about the capitalisation of the classes.

Thoughts?

Edit
Ugh, but people can register the same function name with different numbers of args....which means we can't work back from the function name to the number of args...

@nielsdos
Copy link
Member

nielsdos commented May 6, 2023

Agreed on the charset issue.

About argc and

Ugh, but people can register the same function name with different numbers of args....which means we can't work back from the function name to the number of args...

Yeah this needs some discussion. One idea could be to require the user to pass in the callback function when they delete a sqlite function, such that argc could be derived from that, just like how you suggested it for adding a callback function. But that might be more inconvenient in some cases than having an argc argument and passing in the right argc?

@Danack
Copy link
Contributor

Danack commented May 7, 2023

I'm really leaning towards not including either the unregister or register functions in the Pdo subclasses RFC.

They (imo) need to have the charset aka 'Text Rep" parameter exposed, but that means that someone who understands the following needs to say what the code needs to do internally:

The fourth parameter, eTextRep, specifies what text encoding this SQL function prefers for its parameters. The application should set this parameter to SQLITE_UTF16LE if the function implementation invokes sqlite3_value_text16le() on an input, or SQLITE_UTF16BE if the implementation invokes sqlite3_value_text16be() on an input, or SQLITE_UTF16 if sqlite3_value_text16() is used, or SQLITE_UTF8 otherwise.

The fourth parameter, eTextRep, specifies what text encoding this SQL function prefers for its parameters. The application should set this parameter to SQLITE_UTF16LE if the function implementation invokes sqlite3_value_text16le() on an input, or SQLITE_UTF16BE if the implementation invokes sqlite3_value_text16be() on an input, or SQLITE_UTF16 if sqlite3_value_text16() is used, or SQLITE_UTF8 otherwise. The same SQL function may be registered multiple times using different preferred text encodings, with different implementations for each encoding. When multiple implementations of the same function are available, SQLite will pick the one that involves the least amount of data conversion.

The fourth parameter may optionally be ORed with SQLITE_DETERMINISTIC to signal that the function will always return the same result given the same inputs within a single SQL statement. Most SQL functions are deterministic. The built-in random() SQL function is an example of a function that is not deterministic. The SQLite query planner is able to perform additional optimizations on deterministic functions, so use of the SQLITE_DETERMINISTIC flag is recommended where possible.

The fourth parameter may also optionally include the SQLITE_DIRECTONLY flag, which if present prevents the function from being invoked from within VIEWs, TRIGGERs, CHECK constraints, generated column expressions, index expressions, or the WHERE clause of partial indexes.

For best security, the SQLITE_DIRECTONLY flag is recommended for all application-defined SQL functions that do not need to be used inside of triggers, view, CHECK constraints, or other elements of the database schema. This flags is especially recommended for SQL functions that have side effects or reveal internal application state. Without this flag, an attacker might be able to modify the schema of a database file to include invocations of the function with parameters chosen by the attacker, which the application will then execute when the database file is opened and read.

Currently, there is a hard-coding of flags | SQLITE_UTF8.

@bohwaz
Copy link
Contributor Author

bohwaz commented May 8, 2023

I think that the register and unregister should be exposed in PDO, and that we can continue defaulting to UTF-8, as I don't think we've had any request for supporting something else, so it seems "good enough" for me. But that doesn't fix the issue of missing unregister in SQLite3 :)

As for argc, yes this is bad, we should be able to provide something simpler, like @nielsdos suggested, or maybe save the callback in an array and have the register method return a resource that can be used to unregister:

$resource = $db->registerFunction('like', [SQLite3Utils::class, 'unicodeLike']);
$db->unregisterFunction($resource);

This is the same behaviour as used in Javascript for addEventListener/removeEventListener, but I'm not sure if this behavior exists somewhere else in PHP.

Instead of returning a resource, an integer could be returned maybe. Not sure what's the best here.

@Danack
Copy link
Contributor

Danack commented May 12, 2023

or maybe save the callback in an array and have the register method return a resource that can be used to unregister:

That was one of my gut instincts....but I'm not sure it actually helps. People using that would still need to track which functions are setup from which parameters, to know which one to unregister.

i.e. they would be looking up the resource from an array, based on the function name, params and charset....but they can already track that themselves in userland based on the same parameters.

Probably the most sensible API is just:

PdoSqlite::createFunction(
  string $function_name,
  callable $callback,
  int $num_args = -1,
  int $flags = 0
): bool {}


PdoSqlite::deleteFunction(
  string $function_name,
  int $num_args = -1,
  int $flags = 0
): bool {}

Though I'm even more convinced it needs a separate RFC to the PDO subclassing one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants