Extension Writing Part III: Resources

bigbes edited this page Nov 13, 2014 · 1 revision

  • Introduction
  • Resources
  • Initializing Resources
  • Accepting Resources as Function Parameters
  • Destroying Resources
  • Destroying a Resource by Force
  • Persistent Resources
  • Finding Existing Persistent Resources

Introduction

Up until now, you’ve worked with concepts that are familiar and map easily to userspace analogies. In this tutorial, you’ll dig into the inner workings of a more alien data type – completely opaque in userspace, but with behavior that should ultimately inspire a sense of déjà vu.

Resources

While a PHP zval can represent a wide range of internal data types, one data type that is impossible to represent fully within a script is the pointer. Representing a pointer as a value becomes even more difficult when the structure your pointer references is an opaque typedef. Since there’s no meaningful way to present these complex structures, there’s also no way to act upon them meaningfully using traditional operators. The solution to this problem is to simply refer to the pointer by an essentially arbitrary label called a resource.

In order for the resource’s label to have any kind of meaning to the Zend Engine, its underlying data type must first be registered with PHP. You’ll start out by defining a simple data structure in php_hello.h. You can place it pretty much anywhere but, for the sake of this exercise, put it after the #define statements, and before the PHP_MINIT_FUNCTION declaration. You’re also defining a constant, which will be used for the resource’s name as shown during a call to var_dump().

typedef struct _php_hello_person {
    char *name;
    int name_len;
    long age;
} php_hello_person;
#define PHP_HELLO_PERSON_RES_NAME "Person Data"

Now, open up hello.c and add a true global integer before your ZEND_DECLARE_MODULE_GLOBALS statement:

int le_hello_person;

List entry identifiers (le_*) are one of the very few places where you’ll declare true, honest to goodness global variables within a PHP extension. These values are simply used with a lookup table to associate resource types with their textual names and their destructor methods, so there’s nothing about them that needs to be threadsafe. Your extension will generate a unique number for each resource type it exports during the MINIT phase. Add that to your extension now, by placing the following line at the top of PHP_MINIT_FUNCTION(hello):

le_hello_person = zend_register_list_destructors_ex(NULL, NULL, PHP_HELLO_PERSON_RES_NAME, module_number);
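To make the role of the list entry identifier concrete, here is a minimal standalone sketch of such a lookup table. It is a toy model, not the engine’s implementation: toy_register_type(), toy_type_name() and the toy_type_entry struct are hypothetical names invented for this illustration, and the fixed-size array is an assumption made purely for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every toy_* name is hypothetical and is not a Zend API. */
typedef void (*toy_dtor_t)(void *ptr);

typedef struct {
    const char *name;   /* label shown when a resource of this type is dumped */
    toy_dtor_t  dtor;   /* cleanup for ordinary resources, may be NULL */
    toy_dtor_t  pdtor;  /* cleanup for persistent resources, may be NULL */
} toy_type_entry;

#define TOY_MAX_TYPES 16

static toy_type_entry toy_types[TOY_MAX_TYPES];
static int toy_type_count = 0;

/* Record a new resource type and return its numeric identifier,
 * or -1 when the table is full. */
static int toy_register_type(toy_dtor_t dtor, toy_dtor_t pdtor, const char *name)
{
    assert(name != NULL);
    if (toy_type_count >= TOY_MAX_TYPES) {
        return -1;
    }
    toy_types[toy_type_count].name = name;
    toy_types[toy_type_count].dtor = dtor;
    toy_types[toy_type_count].pdtor = pdtor;
    return toy_type_count++;
}

/* Translate an identifier back into its textual name, or NULL if unknown. */
static const char *toy_type_name(int type)
{
    if (type < 0 || type >= toy_type_count) {
        return NULL;
    }
    return toy_types[type].name;
}
```

Each call to toy_register_type() hands back a different number, which is all an le_* variable needs to hold. Because the table is only written while types are being registered, a plain global is enough to store the result.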

Initializing Resources

Now that you’ve registered your resource, you need to do something with it. Add the following function to hello.c, along with its matching entry in the hello_functions structure, and as a prototype in php_hello.h:

PHP_FUNCTION(hello_person_new)
{
    php_hello_person *person;
    char *name;
    int name_len;
    long age;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sl", &name, &name_len, &age) == FAILURE) {
        RETURN_FALSE;
    }

    if (name_len < 1) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "No name given, person resource not created.");
        RETURN_FALSE;
    }

    if (age < 0 || age > 255) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Nonsense age (%ld) given, person resource not created.", age);
        RETURN_FALSE;
    }

    person = emalloc(sizeof(php_hello_person));
    person->name = estrndup(name, name_len);
    person->name_len = name_len;
    person->age = age;

    ZEND_REGISTER_RESOURCE(return_value, person, le_hello_person);
}

Before allocating memory and duplicating data, this function performs a few sanity checks on the data passed into the resource: Was a name provided? Is this person’s age even remotely within the realm of a human lifespan? Of course, anti-senescence research could make the data type for age (and its sanity-checked limits) seem like the Y2K bug someday, but it’s probably safe to assume no-one will be older than 255 anytime soon.

Once the function has satisfied its entrance criteria, it’s all down to allocating some memory and putting the data where it belongs. Lastly, return_value is populated with a newly registered resource. This function doesn’t need to understand the internals of the data struct; it only needs to know what its pointer address is, and what resource type that data is associated with.

Accepting Resources as Function Parameters

From the previous tutorial in this series, you already know how to use zend_parse_parameters() to accept a resource parameter. Now it’s time to apply that to recovering the data that goes with a given resource. Add this next function to your extension:

PHP_FUNCTION(hello_person_greet)
{
    php_hello_person *person;
    zval *zperson;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "r", &zperson) == FAILURE) {
        RETURN_FALSE;
    }

    ZEND_FETCH_RESOURCE(person, php_hello_person*, &zperson, -1, PHP_HELLO_PERSON_RES_NAME, le_hello_person);

    php_printf("Hello ");
    PHPWRITE(person->name, person->name_len);
    php_printf("!\nAccording to my records, you are %ld years old.\n", person->age);

    RETURN_TRUE;
}

The important parts of the functionality here should be easy enough to parse. ZEND_FETCH_RESOURCE() wants a variable to drop the pointer value into. It also wants to know what the variable’s internal type should look like, and it needs to know where to get the resource identifier from.

The -1 in this function call is an alternative to using &zperson to identify the resource. If any numeric value is provided here other than -1, the Zend Engine will attempt to use that number to identify the resource rather than the zval* parameter’s data. If the resource passed does not match the resource type specified by the last parameter, an error will be generated using the resource name given in the second to last parameter.

There is more than one way to skin a resource though. In fact the following four code blocks are all effectively identical:

ZEND_FETCH_RESOURCE(person, php_hello_person *, &zperson, -1, PHP_HELLO_PERSON_RES_NAME, le_hello_person);

ZEND_FETCH_RESOURCE(person, php_hello_person *, NULL, Z_LVAL_P(zperson), PHP_HELLO_PERSON_RES_NAME, le_hello_person);

person = (php_hello_person *) zend_fetch_resource(&zperson TSRMLS_CC, -1, PHP_HELLO_PERSON_RES_NAME, NULL, 1, le_hello_person);
ZEND_VERIFY_RESOURCE(person);

person = (php_hello_person *) zend_fetch_resource(&zperson TSRMLS_CC, -1, PHP_HELLO_PERSON_RES_NAME, NULL, 1, le_hello_person);
if (!person) {
    RETURN_FALSE;
}

The last couple of forms are useful in situations where you’re not in a PHP_FUNCTION(), and therefore have no return_value to assign; or when it’s perfectly reasonable for the resource type to not match, and simply returning FALSE is not what you want.
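If it helps to picture what a fetch does, the following standalone sketch models it as a table lookup followed by a type comparison. This is a simplified illustration under stated assumptions, not the Zend Engine’s code: toy_register(), toy_fetch() and the toy_resource struct are hypothetical, and the sketch reports a mismatch by returning NULL where the real function would also raise a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the Zend Engine's real structures. */
typedef struct {
    void *ptr;   /* the opaque pointer the extension stored */
    int   type;  /* list entry identifier it was registered under */
} toy_resource;

#define TOY_MAX_RESOURCES 32

static toy_resource toy_list[TOY_MAX_RESOURCES];
static int toy_next_id = 1;  /* labels start at 1 in this sketch */

/* Store a pointer and hand back the numeric label a script would see,
 * or -1 when the table is full. */
static int toy_register(void *ptr, int type)
{
    assert(ptr != NULL);
    if (toy_next_id >= TOY_MAX_RESOURCES) {
        return -1;
    }
    toy_list[toy_next_id].ptr = ptr;
    toy_list[toy_next_id].type = type;
    return toy_next_id++;
}

/* Return the stored pointer only when the label is known and the type
 * matches; otherwise return NULL and let the caller decide what to do. */
static void *toy_fetch(int id, int expected_type)
{
    if (id < 1 || id >= toy_next_id) {
        return NULL;  /* no such label */
    }
    if (toy_list[id].type != expected_type) {
        return NULL;  /* right label, wrong kind of resource */
    }
    return toy_list[id].ptr;
}
```

The caller never learns anything about the stored struct from the table itself. It only gets a pointer back when it asks for the type that was used at registration time, which is why the cast to php_hello_person* after a successful fetch is safe.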

However you choose to retrieve your resource data from the parameter, the result is the same. You now have a familiar C struct that can be accessed in exactly the same way as in any other C program. At this point the struct still ‘belongs’ to the resource variable, so your function shouldn’t free the pointer or change reference counts prior to exiting. So how are resources destroyed?

Destroying Resources

Most PHP functions that create resources have matching functions to free those resources. For example, mysql_connect() has mysql_close(), mysql_query() has mysql_free_result(), fopen() has fclose(), and so on. Experience has probably taught you that if you simply unset() variables containing resource values, then whatever real resource they’re attached to will also be freed or closed. For example:

<?php
    $fp = fopen('foo.txt','w');
    unset($fp);

The first line of this snippet opens a file for writing, foo.txt, and assigns the stream resource to the variable $fp. When the second line unsets $fp, PHP automatically closes the file – even though fclose() was never called. How does it do that?

The secret lies in the zend_register_list_destructors_ex() call you made in your MINIT function. The two NULL parameters you passed correspond to cleanup (or dtor) functions. The first is called for ordinary resources, and the second for persistent ones. We’ll focus on ordinary resources for now and come back to persistent resources later on, but the general semantics are the same. Modify your zend_register_list_destructors_ex() line as follows:

le_hello_person = zend_register_list_destructors_ex(php_hello_person_dtor, NULL, PHP_HELLO_PERSON_RES_NAME, module_number);

and create a new function located just above the MINIT method:

static void php_hello_person_dtor(zend_rsrc_list_entry *rsrc TSRMLS_DC)
{
    php_hello_person *person = (php_hello_person*)rsrc->ptr;
    if (person) {
        if (person->name) {
            efree(person->name);
        }
        efree(person);
    }
}

As you can see, this simply frees any allocated buffers associated with the resource. When the last userspace variable containing a reference to your resource goes out of scope, this function will be automatically called so that your extension can free memory, disconnect from remote hosts, or perform other last minute cleanup.
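The “last variable goes out of scope” rule is ordinary reference counting. The standalone sketch below shows the idea under simplifying assumptions: toy_entry, toy_addref() and toy_release() are hypothetical stand-ins for the engine’s bookkeeping, and malloc()/free() replace emalloc()/efree() so the example builds without PHP headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names standing in for the engine's bookkeeping. */
typedef struct {
    char *name;
    long  age;
} toy_person;

typedef struct {
    toy_person *ptr;
    int         refcount;  /* number of variables currently holding the resource */
} toy_entry;

static int toy_dtor_calls = 0;  /* lets a caller observe when cleanup ran */

/* Plays the role of php_hello_person_dtor(): release everything the struct owns. */
static void toy_person_dtor(toy_person *person)
{
    if (person) {
        free(person->name);
        free(person);
    }
    toy_dtor_calls++;
}

/* Create an entry held by one variable. Returns 0 on success, -1 on failure. */
static int toy_entry_init(toy_entry *entry, const char *name, long age)
{
    size_t len = strlen(name);
    toy_person *person = malloc(sizeof(*person));

    if (!person) {
        return -1;
    }
    person->name = malloc(len + 1);
    if (!person->name) {
        free(person);
        return -1;
    }
    memcpy(person->name, name, len + 1);
    person->age = age;
    entry->ptr = person;
    entry->refcount = 1;
    return 0;
}

/* A second variable now refers to the same resource. */
static void toy_addref(toy_entry *entry)
{
    entry->refcount++;
}

/* A variable was unset or left scope; run the dtor on the last release. */
static void toy_release(toy_entry *entry)
{
    assert(entry->refcount > 0);
    if (--entry->refcount == 0) {
        toy_person_dtor(entry->ptr);
        entry->ptr = NULL;
    }
}
```

Copying the variable corresponds to toy_addref(), and unset() corresponds to toy_release(). The destructor runs exactly once, on the release that brings the count to zero.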

Destroying a Resource by Force

If calling a resource’s dtor function depends on all the variables pointing to it going out of scope, then how do functions like fclose() or mysql_free_result() manage to perform their job while references to the resource still exist? Before I answer that question, I’d like you to try out the following:

<?php
$fp = fopen('test', 'w');

var_dump($fp);
fclose($fp);
var_dump($fp);

In both calls to var_dump(), you can see the numeric value of the resource number, so you know that a reference to your resource still exists; yet the second call to var_dump() claims the type is ‘unknown’. This is because the resource lookup table that the Zend Engine keeps in memory no longer contains the file handle to match that number, so any attempt to perform a ZEND_FETCH_RESOURCE() using that number will fail.

fclose(), like so many other resource-based functions, accomplishes this by using zend_list_delete(). Perhaps obviously, perhaps not, this function deletes an item from a list, specifically a resource list. The simplest use of this would be:

PHP_FUNCTION(hello_person_delete)
{
    zval *zperson;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "r", &zperson) == FAILURE) {
        RETURN_FALSE;
    }

    zend_list_delete(Z_LVAL_P(zperson));
    RETURN_TRUE;
}

Of course, this function will destroy any resource type regardless of whether it’s our person resource, a file handle, a MySQL connection, or whatever. In order to avoid causing potential trouble for other extensions and making userspace code harder to debug, it is considered good practice to first verify the resource type. This can be done easily by fetching the resource into a dummy variable using ZEND_FETCH_RESOURCE(). Go ahead and add that to your function, between the zend_parse_parameters() call and zend_list_delete().
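The verify-then-delete pattern can be sketched outside of PHP as follows. This is a toy model under stated assumptions: toy_add(), toy_lookup() and toy_delete_checked() are hypothetical names, and labels are never reused here so that a stale label stays invalid.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Zend Engine APIs. */
typedef struct {
    void *ptr;
    int   type;
    int   live;  /* cleared once the resource has been destroyed */
} toy_slot;

#define TOY_SLOTS 8

static toy_slot toy_table[TOY_SLOTS];
static int toy_next = 0;  /* labels are handed out once and never reused */

/* Register a pointer under a type; returns its label or -1 when full. */
static int toy_add(void *ptr, int type)
{
    assert(ptr != NULL);
    if (toy_next >= TOY_SLOTS) {
        return -1;
    }
    toy_table[toy_next].ptr = ptr;
    toy_table[toy_next].type = type;
    toy_table[toy_next].live = 1;
    return toy_next++;
}

/* Fetch by label, refusing stale labels and mismatched types. */
static void *toy_lookup(int id, int expected_type)
{
    if (id < 0 || id >= toy_next || !toy_table[id].live) {
        return NULL;
    }
    if (toy_table[id].type != expected_type) {
        return NULL;
    }
    return toy_table[id].ptr;
}

/* Destroy a resource only after confirming it is the expected type.
 * Returns 1 when something was deleted, 0 otherwise. */
static int toy_delete_checked(int id, int expected_type)
{
    if (toy_lookup(id, expected_type) == NULL) {
        return 0;  /* someone else's resource, or already gone */
    }
    toy_table[id].live = 0;
    toy_table[id].ptr = NULL;
    return 1;
}
```

After a successful toy_delete_checked(), the label still exists as a number but no longer resolves to anything, which mirrors what the second var_dump() showed after fclose().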

Persistent Resources

If you’ve used mysql_pconnect(), pfsockopen() or any of the other persistent resource types, then you’ll know that it’s possible for a resource to stick around, not just after all the variables referencing it have gone out of scope, but even after a request completes and a new one begins. These resources are called persistent resources, because they persist throughout the life of the SAPI unless deliberately destroyed.

The two key differences between standard resources and persistent ones are the placement of the dtor function when registering, and the use of pemalloc() rather than emalloc() for data allocation.

Let’s build a version of the person resource that can remain persistent. Start by adding another zend_register_list_destructors_ex() line to MINIT. Don’t forget to define the le_hello_person_persist variable next to le_hello_person:

PHP_MINIT_FUNCTION(hello)
{
    le_hello_person = zend_register_list_destructors_ex(php_hello_person_dtor, NULL, PHP_HELLO_PERSON_RES_NAME, module_number);
    le_hello_person_persist = zend_register_list_destructors_ex(NULL, php_hello_person_persist_dtor, PHP_HELLO_PERSON_RES_NAME, module_number);
    ...

The basic syntax is the same, but this time you’ve specified the destructor function in the second parameter to zend_register_list_destructors_ex() as opposed to the first. All that really distinguishes one of these from the other is when the dtor function is actually called. A dtor function passed in the first parameter is called no later than the shutdown of the active request, while a dtor function passed in the second parameter isn’t called until the module is unloaded during final shutdown.
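The difference in timing can be modelled with two lists that are emptied at different moments. The sketch below is a simplified illustration, not how the engine is written: toy_list, toy_request_shutdown() and toy_module_shutdown() are hypothetical names, and each entry carries its own destructor pointer to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Zend Engine APIs. */
typedef void (*toy_dtor_t)(void *ptr);

typedef struct {
    void      *ptr;
    toy_dtor_t dtor;
} toy_entry;

#define TOY_LIST_SIZE 8

typedef struct {
    toy_entry entries[TOY_LIST_SIZE];
    int       count;
} toy_list;

static toy_list toy_regular_list;     /* emptied at the end of every request */
static toy_list toy_persistent_list;  /* emptied only at final shutdown */

/* Append an entry; returns its index or -1 when the list is full. */
static int toy_list_add(toy_list *list, void *ptr, toy_dtor_t dtor)
{
    if (list->count >= TOY_LIST_SIZE) {
        return -1;
    }
    list->entries[list->count].ptr = ptr;
    list->entries[list->count].dtor = dtor;
    return list->count++;
}

/* Run each entry's destructor, newest first, and leave the list empty. */
static void toy_list_destroy(toy_list *list)
{
    while (list->count > 0) {
        toy_entry *entry = &list->entries[--list->count];
        if (entry->dtor) {
            entry->dtor(entry->ptr);
        }
    }
}

/* End of one request: only ordinary resources are cleaned up. */
static void toy_request_shutdown(void)
{
    toy_list_destroy(&toy_regular_list);
}

/* Final shutdown: whatever is left in either list is cleaned up. */
static void toy_module_shutdown(void)
{
    toy_list_destroy(&toy_regular_list);
    toy_list_destroy(&toy_persistent_list);
}

/* Example destructor: increments the int counter it is handed. */
static void toy_count_dtor(void *ptr)
{
    assert(ptr != NULL);
    (*(int *)ptr)++;
}
```

Calling toy_request_shutdown() any number of times leaves the persistent list untouched, which is the property that lets a persistent resource survive from one request to the next.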

Since you’ve referenced a new resource dtor function, you’ll need to define it. Adding this familiar looking method to hello.c somewhere above the MINIT function should do the trick:

static void php_hello_person_persist_dtor(zend_rsrc_list_entry *rsrc TSRMLS_DC)
{
    php_hello_person *person = (php_hello_person*)rsrc->ptr;
    if (person) {
        if (person->name) {
            pefree(person->name, 1);
        }
        pefree(person, 1);
    }
}

Now you need a way to instantiate a persistent version of the person resource. The established convention is to create a new function with a ‘p’ prefix in the name. Add this function to your extension:

PHP_FUNCTION(hello_person_pnew)
{
    php_hello_person *person;
    char *name;
    int name_len;
    long age;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sl", &name, &name_len, &age) == FAILURE) {
        RETURN_FALSE;
    }

    if (name_len < 1) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "No name given, person resource not created.");
        RETURN_FALSE;
    }

    if (age < 0 || age > 255) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Nonsense age (%ld) given, person resource not created.", age);
        RETURN_FALSE;
    }

    person = pemalloc(sizeof(php_hello_person), 1);
    person->name = pemalloc(name_len + 1, 1);
    memcpy(person->name, name, name_len + 1);
    person->name_len = name_len;
    person->age = age;

    ZEND_REGISTER_RESOURCE(return_value, person, le_hello_person_persist);
}

As you can see, this function differs only slightly from hello_person_new(). In practice, you’ll typically see these kinds of paired userspace functions implemented as wrapper functions around a common core. Take a look through the source at other paired resource creation functions to see how this kind of duplication is avoided.
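One way such a common core might look is sketched below as standalone C. It is an illustration under stated assumptions rather than code from any real extension: toy_person_create(), toy_pemalloc() and the persistent field are hypothetical, and toy_pemalloc() always falls through to malloc() so the example builds without PHP headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, not PHP or Zend APIs. */
typedef struct {
    char *name;
    int   name_len;
    long  age;
    int   persistent;  /* hypothetical field: remembers how the struct was allocated */
} toy_person;

/* Stand-in for pemalloc(size, persistent). Both paths use malloc() here. */
static void *toy_pemalloc(size_t size, int persistent)
{
    (void)persistent;
    return malloc(size);
}

/* Release a person created by toy_person_create(). */
static void toy_person_free(toy_person *person)
{
    if (person) {
        free(person->name);
        free(person);
    }
}

/* Shared core: validate once, then allocate according to the flag.
 * Returns NULL on invalid input or allocation failure. */
static toy_person *toy_person_create(const char *name, int name_len, long age, int persistent)
{
    toy_person *person;

    assert(name != NULL);
    if (name_len < 1 || age < 0 || age > 255) {
        return NULL;
    }
    person = toy_pemalloc(sizeof(*person), persistent);
    if (!person) {
        return NULL;
    }
    person->name = toy_pemalloc((size_t)name_len + 1, persistent);
    if (!person->name) {
        free(person);
        return NULL;
    }
    memcpy(person->name, name, (size_t)name_len);
    person->name[name_len] = '\0';
    person->name_len = name_len;
    person->age = age;
    person->persistent = persistent;
    return person;
}

/* Thin wrappers, one per userspace-facing function. */
static toy_person *toy_person_new(const char *name, int name_len, long age)
{
    return toy_person_create(name, name_len, age, 0);
}

static toy_person *toy_person_pnew(const char *name, int name_len, long age)
{
    return toy_person_create(name, name_len, age, 1);
}
```

The sanity checks and the copying logic now live in one place, and the two wrappers differ only in the flag they pass along.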

Now that your extension is creating both types of resources, it needs to be able to handle both types. Fortunately, ZEND_FETCH_RESOURCE has a sister function that is up to the task. Replace your current call to ZEND_FETCH_RESOURCE in hello_person_greet() with the following:

ZEND_FETCH_RESOURCE2(person, php_hello_person*, &zperson, -1, PHP_HELLO_PERSON_RES_NAME , le_hello_person, le_hello_person_persist);

This will load your person variable with appropriate data, regardless of whether or not a persistent resource was passed.

The function these two FETCH macros call will actually allow you to specify any number of resource types, but it’s rare to need more than two. Just in case, here’s the last statement rewritten using the base function:

person = (php_hello_person*) zend_fetch_resource(&zperson TSRMLS_CC, -1, PHP_HELLO_PERSON_RES_NAME, NULL, 2, le_hello_person, le_hello_person_persist);
ZEND_VERIFY_RESOURCE(person);

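There are two important things to notice here. First, the base function takes the number of acceptable resource types (2 in this case) followed by that many list entry identifiers. Second, you can see that the FETCH_RESOURCE macros automatically attempt to verify the resource. Expanded out, the ZEND_VERIFY_RESOURCE macro in this case simply translates to: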

if (!person) {
    RETURN_FALSE;
}

Of course, you don’t always want your extension function to exit just because a resource couldn’t be fetched, so you can use the real zend_fetch_resource() function to try to fetch the resource type, but then use your own logic to deal with NULL values being returned.
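Accepting “any number of resource types” is done with a C variadic argument list: a count, followed by that many identifiers. The standalone sketch below shows the technique under simplifying assumptions. toy_fetch_any() and toy_resource are hypothetical, and the function takes the resource directly instead of looking it up by label.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Zend Engine APIs. */
typedef struct {
    void *ptr;
    int   type;
} toy_resource;

/* Return res->ptr if res->type equals any of the num_types identifiers
 * that follow, otherwise NULL. When found_type is non-NULL it receives
 * the identifier that matched. */
static void *toy_fetch_any(const toy_resource *res, int *found_type, int num_types, ...)
{
    va_list types;
    void *result = NULL;
    int i;

    assert(num_types >= 0);
    if (res == NULL) {
        return NULL;
    }

    va_start(types, num_types);
    for (i = 0; i < num_types; i++) {
        int candidate = va_arg(types, int);
        if (candidate == res->type) {
            if (found_type) {
                *found_type = candidate;
            }
            result = res->ptr;
            break;
        }
    }
    va_end(types);

    return result;
}
```

Because a NULL return is just a value, the caller is free to fall back to another code path instead of bailing out, which is the flexibility the verifying macros take away.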

Finding Existing Persistent Resources

A persistent resource is only as good as your ability to reuse it. In order to reuse it, you’ll need somewhere safe to store it. The Zend Engine provides for this through the EG(persistent_list) executor global, a HashTable containing resource list entry structures which is normally used internally by the Engine. Modify hello_person_pnew() according to the following:

PHP_FUNCTION(hello_person_pnew)
{
    php_hello_person *person;
    char *name, *key;
    int name_len, key_len;
    long age;
    zend_rsrc_list_entry *le, new_le;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sl", &name, &name_len, &age) == FAILURE) {
        RETURN_FALSE;
    }

    if (name_len < 1) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "No name given, person resource not created.");
        RETURN_FALSE;
    }

    if (age < 0 || age > 255) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Nonsense age (%ld) given, person resource not created.", age);
        RETURN_FALSE;
    }

    /* Look for an established resource */
    key_len = spprintf(&key, 0, "hello_person_%s_%ld\n", name, age);
    if (zend_hash_find(&EG(persistent_list), key, key_len + 1, (void **)&le) == SUCCESS) {
        /* An entry for this person already exists */
        ZEND_REGISTER_RESOURCE(return_value, le->ptr, le_hello_person_persist);
        efree(key);
        return;
    }

    /* New person, allocate a structure */
    person = pemalloc(sizeof(php_hello_person), 1);
    person->name = pemalloc(name_len + 1, 1);
    memcpy(person->name, name, name_len + 1);
    person->name_len = name_len;
    person->age = age;

    ZEND_REGISTER_RESOURCE(return_value, person, le_hello_person_persist);

    /* Store a reference in the persistence list */
    new_le.ptr = person;
    new_le.type = le_hello_person_persist;
    zend_hash_add(&EG(persistent_list), key, key_len + 1, &new_le, sizeof(zend_rsrc_list_entry), NULL);

    efree(key);
}

This version of hello_person_pnew() first checks for an existing php_hello_person structure in the EG(persistent_list) global and, if available, uses that rather than waste time and resources on reallocating it. If it does not exist yet, the function allocates a new structure populated with fresh data and adds that structure to the persistent list instead. Either way, the function leaves you with a new structure registered as a resource within the request.
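Stripped of the PHP-specific calls, this is a find-or-create lookup keyed by a string built from the arguments. The standalone sketch below shows that pattern under simplifying assumptions: toy_person_find_or_create() and the fixed-size toy_persistent array are hypothetical stand-ins for the function above and for EG(persistent_list), and a linear search replaces the hash lookup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, not Zend Engine APIs. */
typedef struct {
    char name[32];
    long age;
} toy_person;

typedef struct {
    char        key[64];
    toy_person *ptr;
    int         used;
} toy_persistent_entry;

#define TOY_PERSISTENT_SLOTS 8

static toy_persistent_entry toy_persistent[TOY_PERSISTENT_SLOTS];
static int toy_allocations = 0;  /* counts how many structs were really created */

/* Return the stored person for (name, age), creating and remembering one
 * on first use. Returns NULL if the name or key does not fit, the table
 * is full, or allocation fails. */
static toy_person *toy_person_find_or_create(const char *name, long age)
{
    char key[64];
    toy_person *person;
    int i, n, free_slot = -1;

    assert(name != NULL);
    if (strlen(name) >= sizeof(person->name)) {
        return NULL;
    }
    n = snprintf(key, sizeof(key), "hello_person_%s_%ld", name, age);
    if (n < 0 || (size_t)n >= sizeof(key)) {
        return NULL;
    }

    /* Look for an established entry, noting the first free slot on the way. */
    for (i = 0; i < TOY_PERSISTENT_SLOTS; i++) {
        if (toy_persistent[i].used) {
            if (strcmp(toy_persistent[i].key, key) == 0) {
                return toy_persistent[i].ptr;
            }
        } else if (free_slot < 0) {
            free_slot = i;
        }
    }
    if (free_slot < 0) {
        return NULL;
    }

    /* New person: allocate, fill in, and remember it under the key. */
    person = malloc(sizeof(*person));
    if (!person) {
        return NULL;
    }
    strcpy(person->name, name);
    person->age = age;
    toy_allocations++;

    strcpy(toy_persistent[free_slot].key, key);
    toy_persistent[free_slot].ptr = person;
    toy_persistent[free_slot].used = 1;
    return person;
}
```

Asking twice for the same name and age returns the same pointer and allocates only once, while changing either argument produces a different key and therefore a separate entry.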

The persistent list used to store pointers is always local to the current process or thread, so there’s never any concern that two requests might be looking at the same data at the same time. If one process deliberately closes a persistent resource, PHP will handle it, removing the reference to that resource from the persistent list so that future invocations don’t try to use the freed data.