Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate ext/odbc resources to opaque objects #12040

Merged
merged 11 commits into from
Apr 28, 2024

Conversation

kocsismate
Copy link
Member

@kocsismate kocsismate commented Aug 23, 2023

Comment on lines 149 to 152
int i;

if (res->values) {
for(i = 0; i < res->numcols; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be as follows:

Suggested change
int i;
if (res->values) {
for(i = 0; i < res->numcols; i++) {
if (res->values) {
for (int i = 0; i < res->numcols; i++) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: this is a preexisting piece of code, originally coming from _free_odbc_result but I'm ok with fixing this.

ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
ext/odbc/php_odbc.c Show resolved Hide resolved
Comment on lines 513 to 514
zend_hash_destroy(&ODBCG(results));
zend_hash_destroy(&ODBCG(connections));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this happen in GSHUTDOWN? Considering they are init in GINIT.

Also if we don't destroy them per request, wouldn't this allow us to also remove the persistent resource type?

Maybe one "generic" way of doing this to check on the persistent flag via a hash_apply function (or whatever the name of those are)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this happen in GSHUTDOWN? Considering they are init in GINIT.

Hmm, great question... I used ext/pgsql as a reference where the same happens.. According to what I could distill from the externals book, the main difference between MINIT and GINIT is this:

GINIT() is launched before MINIT() for the current process. In case of NTS, that’s all. In case of ZTS, GINIT() will be called additionally for every new thread spawned by the thread library.

So I guess I should rather destroy the hashes in GSHUTDOWN if I'm not mistaken...

Also if we don't destroy them per request, wouldn't this allow us to also remove the persistent resource type?

Yeah, this crossed my mind as well, but I'm not sure about it, as the same was done in case of mysqli and pgsql as well: #6791 (comment) But we should definitely discuss the feasibility of your idea with the PHPF. :)

RETURN_THROWS();
}
result = Z_ODBC_RESULT_P(pv_res);
CHECK_ODBC_RESULT(result);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive by remark for bellow, the strcasecmp() call looks fishy as it doesn't deal with binary strings which might be passed as the fname value from userland

}
conn = Z_ODBC_CONNECTION_P(pv_handle);
CHECK_ODBC_CONNECTION(conn);

if (mode == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter name is really bad as I was confused where the hell the mode value came from, as I didn't see it from the userland call.

ext/odbc/php_odbc.c Show resolved Hide resolved
ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks more or less good, except for the "permutation changes" that I can't follow.

@@ -240,8 +246,8 @@ ZEND_BEGIN_MODULE_GLOBALS(odbc)
zend_long default_cursortype;
char laststate[6];
char lasterrormsg[SQL_MAX_MESSAGE_LENGTH];
HashTable *resource_list;
HashTable *resource_plist;
HashTable results;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to keep results?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the list is needed in two places:

  • all results are closed when odbc_close_all() is called
  • when a connection is closed, its related results are closed as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I see it well, the results are not yet added anywhere to the hash table...

Comment on lines -223 to +229
odbc_param_info * param_info;
odbc_param_info *param_info;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend not to mix white-space and unrelated changes with a big patch. You may commit the obvious changes before.

Comment on lines 194 to 203
zend_resource *res;
int persistent;
} odbc_connection;

typedef struct odbc_link {
odbc_connection *connection;
zend_string *hash;
bool persistent;
zend_object std;
} odbc_link;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep persistent flag in odbc_connection.
May be I didn't understand why it was moved...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easier to access it in the link structure, and we don't need it in the connection. However, I can bring it back if you want. :)

ext/odbc/php_odbc.c Show resolved Hide resolved
Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care.

@kocsismate kocsismate force-pushed the odbc-resource branch 2 times, most recently from 8f743f1 to 756b7e1 Compare September 28, 2023 21:19
@@ -316,31 +316,6 @@ static const func_info_t func_infos[] = {
FN("oci_get_implicit_resultset", MAY_BE_RESOURCE|MAY_BE_FALSE),
FN("oci_password_change", MAY_BE_RESOURCE|MAY_BE_BOOL),
FN("oci_new_cursor", MAY_BE_RESOURCE|MAY_BE_FALSE),
FN("odbc_prepare", MAY_BE_RESOURCE|MAY_BE_FALSE),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know, these always return a new instance, thus they can be annotated as @refcount 1, cannot they?

Copy link
Member

@Girgias Girgias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rebase this just to check nothing is broken, and add the UPGRADING entries?

Otherwise this LGTM.

@kocsismate
Copy link
Member Author

@NattyNarwhal Could you please have a look at this PR before I merge it?

@NattyNarwhal
Copy link
Member

I'm on vacation this week, but I'll try to review this Monday night.

@NattyNarwhal
Copy link
Member

I haven't been able to set up an environment to test with yet (I'll probably give it another pass because of this), but I did go over the changes beyond the mechanical changing to an object - they're sensible refactors in general.

function odbc_connection_string_should_quote(string $str): bool {}

function odbc_connection_string_quote(string $str): string {}
namespace ODBC {
Copy link
Member

@SakiTakamachi SakiTakamachi Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If follow the naming convention, wouldn't it be Odbc?

Namespace naming conventions were irrelevant

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yes, namespaces have different rules, but I am fine to rename it, especially because Tim's https://wiki.php.net/rfc/class-naming-acronyms will hopefully/likely pass.

odbc_result *result;
RETCODE rc;
zval *pv_handle;
zend_long pv_which, pv_opt, pv_val;

if (zend_parse_parameters(ZEND_NUM_ARGS(), "rlll", &pv_handle, &pv_which, &pv_opt, &pv_val) == FAILURE) {
if (zend_parse_parameters(ZEND_NUM_ARGS(), "olll", &pv_handle, &pv_which, &pv_opt, &pv_val) == FAILURE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe?

Suggested change
if (zend_parse_parameters(ZEND_NUM_ARGS(), "olll", &pv_handle, &pv_which, &pv_opt, &pv_val) == FAILURE) {
if (zend_parse_parameters(ZEND_NUM_ARGS(), "Olll", &pv_handle, odbc_connection_ce, &pv_which, &pv_opt, &pv_val) == FAILURE) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not possible unfortunately, since the function also accepts a result_ce when the 3rd argument has a value of 2.

@NattyNarwhal
Copy link
Member

Finally back from vacation, so I can test this.

FWIW, I know of at least one library (i.e. zendtech/ibmitoolkit) using is_resource on ODBC resources, but this will be communicated in UPGRADING, and I'lll work to get it fixed upstream in that library before 8.4 comes out.

However, I've been testing the behaviour with the library right now, and I'm getting heap corruption with macOS 14.4/arm64 and IBM's Db2i driver, and I can't reproduce it with current master:

% lldb -- ~/src/php-src/sapi/cli/php test.php     
(lldb) target create "/Users/calvin/src/php-src/sapi/cli/php"
Current executable set to '/Users/calvin/src/php-src/sapi/cli/php' (arm64).
(lldb) settings set -- target.run-args  "test.php"
(lldb) run
Process 48209 launched: '/Users/calvin/src/php-src/sapi/cli/php' (arm64)

Warning: odbc_connect(): SQL error: Failed to fetch error message, SQL state HY000 in SQLConnect in /Users/calvin/src/toolkit-user/test.php on line 13
bool(false)
Process 48209 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10187d70000)
    frame #0: 0x0000000100609614 php`zend_mm_alloc_small(heap=0x0000000101800040, bin_num=15, __zend_filename="Zend/zend_string.h", __zend_lineno=176, __zend_orig_filename=0x0000000000000000, __zend_orig_lineno=0) at zend_alloc.c:1314:33
   1311	
   1312		if (EXPECTED(heap->free_slot[bin_num] != NULL)) {
   1313			zend_mm_free_slot *p = heap->free_slot[bin_num];
-> 1314			heap->free_slot[bin_num] = p->next_free_slot;
   1315			return p;
   1316		} else {
   1317			return zend_mm_alloc_small_slow(heap, bin_num ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
Target 0: (php) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10187d70000)
  * frame #0: 0x0000000100609614 php`zend_mm_alloc_small(heap=0x0000000101800040, bin_num=15, __zend_filename="Zend/zend_string.h", __zend_lineno=176, __zend_orig_filename=0x0000000000000000, __zend_orig_lineno=0) at zend_alloc.c:1314:33
    frame #1: 0x0000000100606894 php`zend_mm_alloc_heap(heap=0x0000000101800040, size=232, __zend_filename="Zend/zend_string.h", __zend_lineno=176, __zend_orig_filename=0x0000000000000000, __zend_orig_lineno=0) at zend_alloc.c:1385:9
    frame #2: 0x0000000100607a0c php`_emalloc(size=200, __zend_filename="Zend/zend_string.h", __zend_lineno=176, __zend_orig_filename=0x0000000000000000, __zend_orig_lineno=0) at zend_alloc.c:2583:9
    frame #3: 0x00000001005f4640 php`zend_string_alloc(len=172, persistent=false) at zend_string.h:176:36
    frame #4: 0x00000001005d0774 php`zend_string_init(str="/**\n     * get a single option value\n     * return property value if property is set.\n     *\n     * @param $optionName\n     * @return bool\n     * @throws \\Exception\n     */\n    public function getToolkitServiceParam($optionName)\n    {\n        // special case, case sensitivity\n        if ($optionName == 'InternalKey') {\n            $optionName = 'internalKey';\n        }\n\n        // we use array_key_exists() rather than isset() because the value may be null.\n        if (array_key_exists($optionName, $this->_options)) {\n            return $this->_options[$optionName];\n        } else {\n            // nothing matched\n            Throw new \\Exception(\"Invalid option requested: $optionName\");\n        }\n    }\n\n    /**\n     * end job if private job (internal key set); end DB transport if not persistent.\n     */\n    public function disconnect()\n    {\n        // if stateful connection, end the toolkit job.\n        if (!$this->isStateless()) {\n            $this->PgmCall(\"OFF\", NULL);\n        }\n\n        // if transport is"..., len=172, persistent=false) at zend_string.h:198:21
    frame #5: 0x00000001005d6df0 php`lex_scan(zendlval=0x000000016fdfcae8, elem=0x000000016fdfce48) at zend_language_scanner.l:2441:21
    frame #6: 0x000000010060fd54 php`zendlex(elem=0x000000016fdfce48) at zend_compile.c:1980:8
    frame #7: 0x00000001005c7da4 php`zendparse at zend_language_parser.c:5154:16
    frame #8: 0x00000001005d1114 php`zend_compile(type=2) at zend_language_scanner.l:599:7
    frame #9: 0x00000001005d10a0 php`compile_file(file_handle=0x000000016fdfdae0, type=2) at zend_language_scanner.l:653:14
    frame #10: 0x00000001003327cc php`phar_compile_file(file_handle=0x000000016fdfdae0, type=2) at phar.c:3349:9
    frame #11: 0x00000001005d18a0 php`compile_filename(type=2, filename=0x000000010189a6e0) at zend_language_scanner.l:704:11
    frame #12: 0x00000001007633c8 php`zend_include_or_eval(inc_filename_zv=0x00000001018173a0, type=2) at zend_execute.c:4938:19
    frame #13: 0x00000001006f6870 php`ZEND_INCLUDE_OR_EVAL_SPEC_CV_HANDLER(execute_data=0x0000000101817350) at zend_vm_execute.h:39967:17
    frame #14: 0x00000001006a1d4c php`execute_ex(ex=0x0000000101817280) at zend_vm_execute.h:57144:7
    frame #15: 0x0000000100639d28 php`zend_call_function(fci=0x000000016fdfdf68, fci_cache=0x000000016fdfdf40) at zend_execute_API.c:971:3
    frame #16: 0x000000010063a930 php`zend_call_known_function(fn=0x0000000101807b58, object=0x000000010187d600, called_scope=0x00000001018064b0, retval_ptr=0x0000000000000000, param_count=1, params=0x000000016fdfe060, named_params=0x0000000000000000) at zend_execute_API.c:1065:23
    frame #17: 0x00000001003bfd24 php`spl_perform_autoload(class_name=0x00000001018890a0, lc_name=0x000000010189eaa0) at php_spl.c:448:3
    frame #18: 0x000000010063b2c0 php`zend_lookup_class_ex(name=0x00000001018890a0, key=0x000000010189eaa0, flags=512) at zend_execute_API.c:1231:7
    frame #19: 0x000000010063c360 php`zend_fetch_class_by_name(class_name=0x00000001018890a0, key=0x000000010189eaa0, fetch_type=512) at zend_execute_API.c:1741:25
    frame #20: 0x00000001006eff60 php`ZEND_NEW_SPEC_CONST_UNUSED_HANDLER(execute_data=0x00000001018171c0) at zend_vm_execute.h:10640:9
    frame #21: 0x00000001006a1d4c php`execute_ex(ex=0x0000000101817020) at zend_vm_execute.h:57144:7
    frame #22: 0x00000001006a2148 php`zend_execute(op_array=0x000000010188a000, return_value=0x0000000000000000) at zend_vm_execute.h:62776:2
    frame #23: 0x000000010065a9d0 php`zend_execute_script(type=8, retval=0x0000000000000000, file_handle=0x000000016fdfecc0) at zend.c:1888:3
    frame #24: 0x0000000100585c0c php`php_execute_script_ex(primary_file=0x000000016fdfecc0, retval=0x0000000000000000) at main.c:2507:13
    frame #25: 0x0000000100585e64 php`php_execute_script(primary_file=0x000000016fdfecc0) at main.c:2547:9
    frame #26: 0x00000001008541cc php`do_cli(argc=2, argv=0x00006000025a13c0) at php_cli.c:966:5
    frame #27: 0x0000000100853300 php`main(argc=2, argv=0x00006000025a13c0) at php_cli.c:1340:18
    frame #28: 0x000000018ae990e0 dyld`start + 2360

without zend_alloc:

% USE_ZEND_ALLOC=0 ~/src/php-src/sapi/cli/php test.php

Warning: odbc_connect(): SQL error: Failed to fetch error message, SQL state HY000 in SQLConnect in /Users/calvin/src/toolkit-user/test.php on line 13
php(48201,0x1e1aedc40) malloc: Heap corruption detected, free list is damaged at 0x6000032d9340
*** Incorrect guard value: 256893118747455
php(48201,0x1e1aedc40) malloc: *** set a breakpoint in malloc_error_break to debug
zsh: abort      USE_ZEND_ALLOC=0 ~/src/php-src/sapi/cli/php test.php

(My configuration invocation for this is ./buildconf && ./configure --with-unixODBC --with-iconv=/opt/local --with-openssl --with-zlib --enable-debug && make -j10.)

Sample program:

<?php

// composer require zendtech/ibmitoolkit
require("vendor/autoload.php");

use ToolkitApi\Toolkit;

// set $cs/$u/$p to some system with i and the i ODBC driver, i.e.
$cs = "Driver=IBM i Access ODBC Driver;System=pub400.com;CCSID=1208";
$u = "";
$p = "";

$odbc = odbc_connect($cs, $u, $p);
var_dump($odbc);

$tkobj = ToolkitService::getInstance($odbc, "", "", "odbc");
$tkobj->setOptions(array('stateless' => true));
$result = $tkobj->ClInteractiveCommand('DSPLIBL');
print_r($result);

@NattyNarwhal
Copy link
Member

@kocsismate
Copy link
Member Author

kocsismate commented Mar 28, 2024

@NattyNarwhal thank you for your help! Fortunately, I could also reproduce the failures on my M3 Mac, and I have already managed to fix most of the problems locally, but ASAN still keeps on bringing up a few more similar issues (double frees mostly). Also, there may have been some conflict resolution issue 🤔 i hope that I can push my changes during the weekend.

@kocsismate
Copy link
Member Author

kocsismate commented Apr 6, 2024

Huh, there were a lot of errors, but now it looks like tests are fixed locally. I happen to notice some leaks when I'm using the debugger, so I'll try to have a look at them as well. I'll leave review comments soon about what and why I had to change.

link->connection = NULL;
ODBCG(num_links)--;
if (!link->persistent) {
ZEND_ASSERT(link->connection && "link has already been closed");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this assert in order to be clear when this function can be used.

if (link->persistent) {
zend_hash_del(&EG(persistent_list), link->hash);
} else {
close_results_with_connection(link->connection);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can now close all the results related to the relevant connection before it itself is freed.

zend_string_release(link->hash);
link->hash = NULL;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't use explicit nulls before, but they were useful in order to prevent double frees.

@@ -154,6 +172,7 @@ static void odbc_result_free(odbc_result *res)
}
efree(res->values);
res->values = NULL;
res->numcols = 0;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm zeroing this member just in case

odbc_result *result = odbc_result_from_obj(obj);

if (!result->stmt) {
if (!result->conn_ptr) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the conn_ptr member is checked whether a result is open (CHECK_ODBC_RESULT(result);), I started to use it here as well.

SQLAllocConnect(link->connection->henv, link->connection->hdbc);
link->hash = zend_string_copy(hash);
ret = SQLAllocEnv(&((*link->connection).henv));
if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this check when debugging the above mentioned issue. I think we could keep not checking the error condition, and then an error would surface later on anyway, but I guess it's better to catch it as early as possible. I can remove this though if you think it's not needed.

return false;
}

ret = SQLAllocConnect((*link->connection).henv, &((*link->connection).hdbc));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same here as above

@@ -2358,9 +2398,9 @@ void odbc_do_connect(INTERNAL_FUNCTION_PARAMETERS, int persistent)
RETURN_FALSE;
}
ODBCG(num_links)++;
zend_hash_update_ptr(&ODBCG(connections), hashed_details.s, Z_ODBC_LINK_P(return_value));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally, this line used to be in the wrong place a few lines below so that even persistent connections ended up being in the (non-persistent) connection list. :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why update? What about the old value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also you should document in the header this hashtable does not take ownership of the connection, rather it is a borrow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just realized the bug here too when answering your other comment 🙂 I'll fix this for sure. Although the previous code didn't use connection reuse, but now we should in order not to have to take care of hash collision. I'm uncertain whether this change would cause any problems for people

@@ -2378,12 +2418,6 @@ PHP_FUNCTION(odbc_close)
link = Z_ODBC_LINK_P(pv_conn);
CHECK_ODBC_CONNECTION(link->connection);

if (link->persistent) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this block of code is not needed anymore since odbc_link_free() makes sure to remove any trace of the connection


if (link->persistent) {
zend_hash_del(&EG(persistent_list), link->hash);
} else {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_close_odbc_pconn() will free most of the stuff when the resource is freed, so we have to avoid doing so again here, that's why I moved these pieces of code in the else block.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my review so far, and I think you can already fix some leaks with this.
I stopped looking for the moment being. All in all, to be honest, I find the code quite messy (not really your fault ofc, this is the problem with legacy code) with insufficient comments to say why something is the way it is instead of what. In particular, there is no clearly defined model of which structure holds strong references to data vs only borrows data.

HashTable *resource_list;
HashTable *resource_plist;
HashTable results;
HashTable connections;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ODBCG(connections) is used to store non-persistent connections, so they don't survive across requests, so they should be initialized in RINIT and cleaned up in RSHUTDOWN. That table should also be called non_persistent_connections btw, otherwise it's confusing for reviewers.
Creating a persistent table and then storing non-persistent stuff in it is icky.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side question, why do we hold such a reference at all? I know this is pre-existing, but I think it makes sense to clean up non-persistent connections as soon as the connection object becomes unreachable instead of waiting until the request ends, no? I do see that odbc_connect closes both persistent and non-persistent connections, but I find the fact that connections are held even if the connection object isn't reachable from userland very very weird.
The concept just makes everything more difficult than it has to be imho.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally put it there AFAIR but I changed for GSHUTDOWN after this review comment: #12040 (comment) I even looked up https://www.phpinternalsbook.com/php7/extensions_design/php_lifecycle.html#globals-termination-gshutdown and thought that GSHUTDOWN is called during RSHUTDOWN assuming that these globals should be freed there, however, I missed this sentence at the very end of the section: Remember that globals are not cleared after every request; aka GSHUTDOWN() is not called as part of RSHUTDOWN(). So you are surely right, I'll use RSHUTDOWN then.

Copy link
Member Author

@kocsismate kocsismate Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side question, why do we hold such a reference at all? I know this is pre-existing, but I think it makes sense to clean up non-persistent connections as soon as the connection object becomes unreachable instead of waiting until the request ends, no?

The connections hash table is needed because there's a odbc_close_all() function which closes all connections right away. The results hash table is needed because results are automatically closed as soon as a connection is closed... And I agree with you, I also think that odbc_close_all() is probably the main source of complexity in this extension... Also, by storing connections in a hash map, it's possible to implement connection reuse if we want to (it is already available for persistent resources).

static zend_object_handlers odbc_connection_object_handlers, odbc_result_object_handlers;

static void odbc_insert_new_result(zval *result) {
Z_TRY_ADDREF_P(result);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use Z_TRY_ADDREF_P in odbc_insert_new_result instead of Z_ADDREF_P? That's not necessary because it'll always be an object, that would be good to assert btw.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no specific reason TBH, it's copy-pasted from other places where Z_TRY_ADDREF_P was used. I'll add the assert as well

ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
@@ -2358,9 +2398,9 @@ void odbc_do_connect(INTERNAL_FUNCTION_PARAMETERS, int persistent)
RETURN_FALSE;
}
ODBCG(num_links)++;
zend_hash_update_ptr(&ODBCG(connections), hashed_details.s, Z_ODBC_LINK_P(return_value));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why update? What about the old value?

}
} ZEND_HASH_FOREACH_END();

/* Loop through the results list searching for any dangling results and close all statements */
ZEND_HASH_FOREACH_VAL(&ODBCG(results), zv) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need this for persistent connections right? Please make this clear in the comments.

Copy link
Member Author

@kocsismate kocsismate Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not exactly sure what you mean here, but results may have either a persistent or a non-persistent connection

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't odbc_link_free closing the results via close_results_with_connection? In that case, this loop only affects persistent connections.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does, but so does _close_odbc_pconn(). So the point is rather to close any remaining results which were omitted by any reason. Or it doesn't make sense to do so?

@@ -2358,9 +2398,9 @@ void odbc_do_connect(INTERNAL_FUNCTION_PARAMETERS, int persistent)
RETURN_FALSE;
}
ODBCG(num_links)++;
zend_hash_update_ptr(&ODBCG(connections), hashed_details.s, Z_ODBC_LINK_P(return_value));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also you should document in the header this hashtable does not take ownership of the connection, rather it is a borrow.

ext/odbc/php_odbc.c Show resolved Hide resolved
Comment on lines 133 to 146
if (!link->connection) {
zend_object_std_dtor(&link->std);
return;
}

odbc_link_free(link);
zend_object_std_dtor(&link->std);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the more logical way of writing this, at least it is clearer to me:

Suggested change
if (!link->connection) {
zend_object_std_dtor(&link->std);
return;
}
odbc_link_free(link);
zend_object_std_dtor(&link->std);
if (link->connection) {
odbc_link_free(link);
}
zend_object_std_dtor(obj);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine going with your preference, it makes sense to me. P.S: I implemented this the current way because I adopted Nikita's style (d96219c), but I've just realized that I did the same way as you mentioned in case of pgsql:

if (link->conn) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early-return style is fine and I use it myself. In this particular case however, I found it adds more noise because both paths are quite simple, so by not using early-return style the code becomes easier to read.

ZEND_HASH_FOREACH_NUM_KEY_VAL(&ODBCG(results), num_index, p) {
odbc_result *res = Z_ODBC_RESULT_P(p);
if (res->stmt == result->stmt) {
zend_hash_index_del(&ODBCG(results), num_index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This note is for all occurrences of list modification during a foreach: I would need to check if it is actually always possible to delete during a loop. I remember fixing a weird data corruption issue caused by deleting inside foreach in ext-soap.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering about the same when implementing these foreach based modifications. I'll try to look your changes up and fix accordingly. Since a statement should only be present once in the hash table, I guess the fix is going to be easy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it only occurs once, then you can the removal it in the loop, but use a break; when the item is removed.
This is what I was referring to: 7e4a323
Although it might be fine in your case as the hashtable is not resized.

ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
return NULL;
zend_hash_next_index_insert_new(connection->results, result);
odbc_result *res = Z_ODBC_RESULT_P(result);
res->index = zend_hash_num_elements(connection->results) - 1;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now store the index so that it's faster to remove the item from the hashtable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't right.
It's good to store an index, but the way you assign this is wrong.
Let's say we have three results in the table with indices 0, 1, and 2.
Now we insert an extra result, it gets index 3 because num_elements-1==3. That works.

But now we remove index 0. In the table we have: 1, 2, 3.
num_elements-1==3, so now we have a problem. It should've been index 4.

You should probably be using nNextFreeElement

return &intern->std;
}
static void odbc_insert_new_result(odbc_connection *connection, zval *result) {
ZEND_ASSERT(Z_TYPE_P(result) == IS_OBJECT && instanceof_function(Z_OBJCE_P(result), odbc_result_ce));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the assert for the result type check

}

static void odbc_link_free(odbc_link *link)
{
ZEND_ASSERT(link->connection && "link has already been closed");

if (link->persistent) {
zend_hash_del(&EG(persistent_list), link->hash);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that the persistent list only stores the connections themselves so these should only be freed when a connection/all connections are explicitly closed, or during module shutdown. AFAIR I still need to implement freeing persistent connections for odbc_close(). It's done for odbc_close_all() though.

ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
ext/odbc/php_odbc.c Show resolved Hide resolved
ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
ext/odbc/php_odbc.c Show resolved Hide resolved
ext/odbc/php_odbc.c Show resolved Hide resolved
static void odbc_insert_new_result(odbc_connection *connection, zval *result) {
ZEND_ASSERT(Z_TYPE_P(result) == IS_OBJECT && instanceof_function(Z_OBJCE_P(result), odbc_result_ce));
if (!connection->results) {
connection->results = zend_new_array(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How beneficial is it to only allocate this array lazily? I assume that if you create a connection, you probably want to execute some queries.
If you agree, I'd choose to not lazily allocate this, and instead of having a HashTable* have a HashTable this will simplify some code as some NULL checks become irrelevant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was also wondering exactly about this, whether I should rather use a HashTable value, since there's a high possibility that a connection will have a result. Thanks for pointing this out!

ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
zend_hash_apply_with_argument(&EG(persistent_list),
_close_pconn_with_res, (void *)p);
}
/* Loop through the non-persistent connection list, now close all non-persistent connections and their results */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment such be above the ZEND_HASH_FOREACH_VAL(&ODBCG(non_persistent_connections), zv) { loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also normally I'd say to combine the clean and the loop using ZEND_HASH_FOREACH_END_DEL at the end, but see #8671 (comment). I guess this is something to keep in mind for a follow-up cleanup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for letting me know about ZEND_HASH_FOREACH_END_DEL()! I didn't know about it. I'll wait until the fix comes :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it seems that forward iteration wasn't really intended to work with END_DEL, and while we could use reverse iteration that would be a bit silly. So it's fine to keep this as-is.

ext/odbc/php_odbc.c Show resolved Hide resolved
ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
return NULL;
zend_hash_next_index_insert_new(connection->results, result);
odbc_result *res = Z_ODBC_RESULT_P(result);
res->index = zend_hash_num_elements(connection->results) - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't right.
It's good to store an index, but the way you assign this is wrong.
Let's say we have three results in the table with indices 0, 1, and 2.
Now we insert an extra result, it gets index 3 because num_elements-1==3. That works.

But now we remove index 0. In the table we have: 1, 2, 3.
num_elements-1==3, so now we have a problem. It should've been index 4.

You should probably be using nNextFreeElement

ext/odbc/php_odbc.c Outdated Show resolved Hide resolved
ext/odbc/php_odbc.c Show resolved Hide resolved
ext/odbc/odbc.stub.php Outdated Show resolved Hide resolved
ext/odbc/odbc.stub.php Outdated Show resolved Hide resolved
kocsismate and others added 3 commits April 24, 2024 23:33
1. GC_DELREF isn't enough because the hash table may hold the last
   reference
2. The reference should be deleted inside the odbc_result_free function
   to avoid forgetting to call it in other places.
This makes sure we don't get a performance penalty in release mode
because the release mode build will still perform the call as the
compiler doesn't know instanceof is side-effect-free.
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed two commits, one to fix leak (check commit description for reasoning), and one more cleanup-oriented commit.
The final leak is due to the connection object for persistent connections not being freed (which I already commented on a few days ago).


/* If aborted via timer expiration, don't try to call any unixODBC function */
if (!(PG(connection_status) & PHP_CONNECTION_TIMEOUT)) {
safe_odbc_disconnect(conn->hdbc);
SQLFreeConnect(conn->hdbc);
SQLFreeEnv(conn->henv);
}
free(conn);
Copy link
Member

@nielsdos nielsdos Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes the final leak. You never free the connection object itself for persistent connections.
However, if you do that here, then you will get use-after-frees because there might still be link objects that refer to this.
You should clearly defined in the code via comments what the expected behaviour is of link objects that exist while the connection is already gone (*). And then figure out what to do in that case.

(*) The reason I'm mentioning this is because I find this extension lacking in comments explaining intent and why something is the way it is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes! Yeah, when I was debugging the use-after-free, and I found out that this free "causes" it, so I removed it and then I tried to get rid of the memleak regarding result->std. And sorry for this regression!

Copy link
Member Author

@kocsismate kocsismate Apr 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I think I managed to come up with a solution. I'm unsure if there is a better one, but this was my best attempt so far. I explained the idea in the header file, but basically I now started to store persistent links in the ODBC global connection hash table. This way I can keep all the connection references. And when a connection/all connections are closed, I can execute odbc_link_free() and _close_odbc_pconn() in the right order so that the persistent connection itself is always accessible inside the former function and only freed in the latter call.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your strategy seems fine, only minor comments left and then I think it's good to go.

if (link->hash) {
zend_hash_del(&ODBCG(non_persistent_connections), link->hash);
}
if (link->hash) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge this if's body with the if's body at the bottom of this function.

@@ -878,14 +879,14 @@ PHP_FUNCTION(odbc_close_all)
}

/* Loop through the non-persistent connection list, now close all non-persistent connections and their results */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is inaccurate now: previously this looped over non-persistent connections but now it does both non-persistent and persistent.


if (zend_parse_parameters(ZEND_NUM_ARGS(), "O", &pv_conn, odbc_connection_ce) == FAILURE) {
RETURN_THROWS();
}

link = Z_ODBC_LINK_P(pv_conn);
CHECK_ODBC_CONNECTION(link->connection);
connection = Z_ODBC_CONNECTION_P(pv_conn);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: I prefer putting the declaration and assignment on the same line, i.e. odbc_connection *connection = Z_ODBC_CONNECTION_P(pv_conn);. This avoids bugs because it limits the scope of a variable such that it's not accidentally used uninitialized or in places it shouldn't be used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I fixed this in quite some places

return ZEND_HASH_APPLY_KEEP;
}

odbc_connection *list_conn = ((odbc_connection *)(le->ptr));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit:

Suggested change
odbc_connection *list_conn = ((odbc_connection *)(le->ptr));
odbc_connection *list_conn = (odbc_connection *) le->ptr;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this entire function can be simplified into:

static int _close_pconn_with_res(zval *zv, void *p)
{
	zend_resource *le = Z_RES_P(zv);

	if (le->ptr == p) {
		return ZEND_HASH_APPLY_REMOVE;
	}

	return ZEND_HASH_APPLY_KEEP;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, initially I had a similar code which you suggested, but I wanted to debug it, that's why I created the local variables so that I can inspect them. Now that finally I don't have to do debugging, I'm fine to revert these changes

Comment on lines 130 to 135
/* If aborted via timer expiration, don't try to call any unixODBC function */
if (!(PG(connection_status) & PHP_CONNECTION_TIMEOUT)) {
safe_odbc_disconnect(link->connection->hdbc);
SQLFreeConnect(link->connection->hdbc);
SQLFreeEnv(link->connection->henv);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exact same code (including comment) exists in _close_odbc_pconn, please factor that out into a common function.
I see this is pre-existing so it's okay to do this separately.

res->param_info = NULL;
}

HashTable *tmp = &res->conn_ptr->results;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: unclear variable name, please use results or something alike instead.

@@ -378,6 +502,12 @@ static PHP_GINIT_FUNCTION(odbc)
ZEND_TSRMLS_CACHE_UPDATE();
#endif
odbc_globals->num_persistent = 0;
zend_hash_init(&odbc_globals->connections, 0, NULL, NULL, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit:

Suggested change
zend_hash_init(&odbc_globals->connections, 0, NULL, NULL, 1);
zend_hash_init(&odbc_globals->connections, 0, NULL, NULL, true);

}

static void odbc_connection_free_obj(zend_object *obj)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: You're inconsistent in { placement on new line vs end of line. This happens not only here but for other functions too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I saw, { on its own line is more prevalent in this file, so I'll go with this style.

object_init_ex(zv, odbc_connection_ce);
link = Z_ODBC_LINK_P(zv);
link->connection = pecalloc(1, sizeof(odbc_connection), persistent);
zend_hash_init(&link->connection->results, 0, NULL, ZVAL_PTR_DTOR, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit:

Suggested change
zend_hash_init(&link->connection->results, 0, NULL, ZVAL_PTR_DTOR, 1);
zend_hash_init(&link->connection->results, 0, NULL, ZVAL_PTR_DTOR, true);

@@ -2190,17 +2299,16 @@ void odbc_do_connect(INTERNAL_FUNCTION_PARAMETERS, int persistent)
persistent = 0;
}

char *hashed_details;
int hashed_details_len = spprintf(&hashed_details, 0, "%d_%s_%s_%s_%s_%d", persistent, ODBC_TYPE, db, uid, pwd, cur_opt);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is pre-existing, but this should be size_t instead of int.

@kocsismate
Copy link
Member Author

@nielsdos I think I managed to fix all your comments. I added the upgrading notes + renamed the namespace according to Tim's RFC which will likely pass.

@nielsdos
Copy link
Member

@kocsismate arginfo file is outdated, please update and I will review soon :-)

@kocsismate
Copy link
Member Author

Arginfo file is outdated, please update and I will review soon :-)

Yeah, I noticed it around the same time, fix is pushed :)

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 minor non-critical remarks, LGTM otherwise. Thanks.


zend_object_std_init(&intern->std, class_type);
object_properties_init(&intern->std, class_type);
intern->std.handlers = &odbc_connection_object_handlers;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line isn't necessary I think because the default object handler is set in the ce.

{
int ret;

ret = SQLDisconnect( handle );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge with the declaration at line 123 please.


zend_object_std_init(&intern->std, class_type);
object_properties_init(&intern->std, class_type);
intern->std.handlers = &odbc_result_object_handlers;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same: it's set in the ce so you don't need to set it here

@kocsismate kocsismate merged commit afd91fb into php:master Apr 28, 2024
9 of 10 checks passed
@kocsismate kocsismate deleted the odbc-resource branch April 28, 2024 13:46
@iluuu1994
Copy link
Member

@kocsismate
Copy link
Member Author

@iluuu1994 I'm trying to find the root cause of it. Thanks for letting me know about it!

@kocsismate
Copy link
Member Author

kocsismate commented Apr 30, 2024

Update: The failures happen when ZEND_RC_DEBUG is enabled:

Assertion failed: ((zval_gc_flags((p)->u.type_info) & ((1<<7)|(1<<8))) != (1<<7)), function zend_gc_delref, file zend_types.h, line 1343.

@nielsdos
Copy link
Member

@kocsismate Data that are persistent but never shared should be marked as GC_MAKE_PERSISTENT_LOCAL.

@kocsismate
Copy link
Member Author

@nielsdos Indeed, thank you, I haven't come across this flag yet :(

@kocsismate
Copy link
Member Author

I've just filed #14094 to attempt to fix the failures

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants