Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persistent Mux in the extension #18

Closed
jrbasso opened this issue Jan 15, 2014 · 48 comments
Closed

Persistent Mux in the extension #18

jrbasso opened this issue Jan 15, 2014 · 48 comments

Comments

@jrbasso
Copy link

jrbasso commented Jan 15, 2014

To improve even more the performance, you can make create an alias for the Mux class and it be stored in the extension persistent memory.

For web requests, the first request will load from the compiled router file. For subsequent request it can use the persisted object in memory. It will avoid file access to read the compiled file and re-load all the objects for every request.

You can make something like:

$mux = \Pux\MuxCache::load('compiled_router.php', 'app_routes');
// If not stored, include compiled_router and persist it, otherwise just use the persisted object from app_routes alias

\Pux\MuxCache::clear(); // Remove all persisted objects

Something like that. Thoughts?

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

Sounds great, I think we could have PersistentMux or something that load routes from a definition file.

@jrbasso
Copy link
Author

jrbasso commented Jan 15, 2014

That would be good too. How do you imagine the definition file? Some notation language (JSON, YAML, etc), using the PHP Mux classes or a simple array?

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

the definition file could be a simple php script that returns Mux object because we need to compile the patterns for PCRE routes. and we can cache the Mux object (zval pointer) in the persistent memory) theoretically.

@jrbasso
Copy link
Author

jrbasso commented Jan 15, 2014

Right, this was my intention with the load method on my initial example. I mentioned the compiled_router, but it could be anything that returns a Mux instance. It will be lazy loaded, depending if the alias is currently in the persisted hash table.

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

yeah, and a config key can be added for checking the fstat or not. if the filemtime is greater, then we do reload the definition file. that will be easier for development.

@jrbasso
Copy link
Author

jrbasso commented Jan 15, 2014

It would be nice. But for a production environment it would be better if not check the filemtime, similar of the apc.stat behavior.

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

yeah that's right, just like apc.stat.

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

to gain more, we could reduce the function calls to one statement, something like:

$route = Pux\PersistentMux::dispatch('app_routes', 'routes.php', $_SERVER['PATH_INFO']);

and the parameters can be read as "get persistent mux from key 'app_routes', if no, read the definition file from 'routes.php', then dispatch path with PATH_INFO)

@jrbasso
Copy link
Author

jrbasso commented Jan 15, 2014

Awesome! The app will fly.

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

Yeah! I am so excited

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

Rename Pux\PersistentMux to Pux\PersistentDispatcher seems like better?

@jrbasso
Copy link
Author

jrbasso commented Jan 15, 2014

I saw you have Pux\Dispatcher\APCDispatcher, seems this new class can be in the same namespace, maybe Pux\Dispatcher\PersistentDispatcher.

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

yeah right

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

NOTE:

This seems need to allocate a global array (internally) which will be much easier to maintain the hash table.

Also there is a PZVAL_LOCK, which seems like for locking persistent zval.

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

INIT_PZVAL is doing the job, and which is called by MAKE_STD_ZVAL. so the key point is to keep this global array thread safe with TSRM

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

HashTable can be implemented in:

static HashTable pux_mux_hash;

And can be initilized in the MINIT function:

zend_hash_init(&pux_mux_hash, 0, NULL, NULL, 1);

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

currently working on https://github.com/c9s/Pux/tree/persistent-dispatcher branch

@jrbasso
Copy link
Author

jrbasso commented Jan 15, 2014

Damn, this was fast. Nice to see some code coming for that. 👍

@c9s
Copy link
Owner

c9s commented Jan 15, 2014

Wondering the difference between EG(persistent_list) and Global variable management. thoughts?

When using zend_hash_add in PHP_FUNCTION, it shows ht is destorying. not sure what caused this segmentation fault.

@jrbasso
Copy link
Author

jrbasso commented Jan 16, 2014

I am not very good with PHP extensions. I can read and understand the code, I wrote some small tests, but nothing decent.

But, I looked few extensions to see how they manage the persistent connections and seems all them use the EG(persistent_list). Here is how the memcache extension handle the persistent connections https://github.com/php/pecl-caching-memcache/blob/ff0dd92d53ef33f654a564b8f0b55743823f3ad4/memcache.c#L2213
Seems they always use zend_hash_update to insert instead of the zend_hash_add.

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

yeah, seems like the only way to do it is to use the EG(persistent_list), global variable seems don't work for me, although I found date extension used a HashTable pointer in global context, it reset the HashTable per request.

@jrbasso
Copy link
Author

jrbasso commented Jan 17, 2014

As I said before, I am really bad with php extensions, but I started a
branch locally and I am trying to get as much as I can.

When I have some decent code to show I will push to my fork and probably
you can put some comments in what I am doing wrong. So far I just have the
class definition, but nothing is implemented on the dispatch method.

I hope to have something more concrete tomorrow or in the weekend.

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

I've done the store_mux functionality, I think we need to implement a basic function to archive that. please see #24 for more details.

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

So I just got the basic persistent functions work, now we have in-process persistent mux object!

<?php
echo "process: " , getmypid() , "\n";
$mux = pux_fetch_mux('app');
if ( $mux ) {
    echo "Mux Fetched!\n";
} else {
    $mux = new Pux\Mux;
    echo "Mux Created!\n";
    pux_store_mux('app', $mux);
    echo "Mux Stored!\n";
}

Please see if this works for you!

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

Refactored to 88c2c4d, so we can reuse the static functions in our dispatch function.

I think the fastest way to call is:

$route = pux_persistent_dispatch('app', 'routes.php',  $_SERVER['PATH_INFO']);

$route = pux_persistent_dispatch('app', 'routes.php');  // dispatch SERVER['PATH_INFO'] by default, so we don't pass extra parameters.

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

The next thing we need to implement is the "require" mech. currently I found compile_filename or zend_compile_file but they only returns op_array.

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

zend_op_array *compile_filename(int type, zval *filename TSRMLS_DC) is something that we can handle files internally.

@jrbasso
Copy link
Author

jrbasso commented Jan 17, 2014

When I was working on that yesterday I decided to expect a function instead of the string. My plan is call the function and expect a mux instance. Version 2 can improve to load the file.

So, basically my API will looks like:

Pux\Dispatcher\PersistentDispatcher::dispatch('alias', function() { return include 'routes.php'; }, '/foo');

In this case I was wondering in move the function to the last parameter...

Btw, really nice to have the methods in place. 👍

@jrbasso
Copy link
Author

jrbasso commented Jan 17, 2014

Also, having the function allow people to lazy instantiate the mux inline without have a file to it, or even call a function somewhere else. Ie

// Lazy mux instance
Pux\Dispatcher\PersistentDispatcher::dispatch('alias', function() {
  $mux = new Pux\Mux();
  // ...
  return $mux;
}, '/foo');

// Using OO
class MyAppRouting {
  public function routes() {
    // ...
    return $mux;
  }
}

$routing = new MyAppRouting();
Pux\Dispatcher\PersistentDispatcher::dispatch('alias', function() use (&$routing) {
  return $routing->routes();
}, '/foo');

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

I think it's fine to also have a OO style interface to do that

To gain the performance, function call is faster. it's because static method call require class entry lookup.

We need to first implement the function, then reuse the function in our static method for our OO style interface.

Thoughts?

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

yeah, I also like the idea of lazy initialization.

@c9s
Copy link
Owner

c9s commented Jan 17, 2014

but If we want to use closure as lazy evaluation, putting the request URI as the third parameter might not be a good choice.

@jrbasso
Copy link
Author

jrbasso commented Jan 17, 2014

Hmmm, little ugly how the include works, but if it is the way, what can we do?!

I am not a fan of functions on extensions. As your extension use everything with classes, I would suggest to keep all in classes with the Pux namespace.

And I agree having the URI as 3rd param will not be nice with callable. So the new signature would be: Pux\Dispatcher\PersistentDispatcher::dispatch($alias, $uri, callbable $func).

@c9s
Copy link
Owner

c9s commented Jan 18, 2014

I guess it's not the ugliest one. =p I pushed some code that does the "require" which is actually "eval" some code.
the actual code is more complex than that one above.

also I would like to provide both OO interface and function interface. users may extend from the dispatcher class to extend more functionality. and function call interface for faster call, which is with a prefix "pux_", so don't worry about the naming conflict.

In your static method call synopsis, I think $alias and $func are related, we should put them together and it will be more intuitive. something like:

Pux\Dispatcher\PersistentDispatcher::dispatch($uri, $alias, callbable $func);

If we want to make $uri as an optional parameter, it can be another method call I think.

@c9s
Copy link
Owner

c9s commented Jan 18, 2014

Done!

pux_persistent_dispatch.

@c9s
Copy link
Owner

c9s commented Jan 18, 2014

After some testing, I found a zend engine problem, that is... even you use pecalloc to alloc a memory for zval. zend gc will recycle any zval after every requests. so that will cause segmentation fault.

@c9s
Copy link
Owner

c9s commented Jan 18, 2014

So.. the solution will be much more complex - serialize the object just like apc do and store it in persistent memory.

@jrbasso
Copy link
Author

jrbasso commented Jan 18, 2014

If I am not wrong, you have to use zval_add_ref to increment the reference count in the zval, avoiding the gc to destroy it.

@c9s
Copy link
Owner

c9s commented Jan 18, 2014

Actually the reference counting is done by Z_ADDREF_P macro.

I've tried, that doesn't work, reference counting only works in the same
context, at the end of request, the garbage collector cleans up all
variables.

On Sun, Jan 19, 2014 at 1:19 AM, Juan Basso notifications@github.comwrote:

If I am not wrong, you have to use zval_add_ref to increment the
reference count in the zval, avoiding the gc to destroy it.


Reply to this email directly or view it on GitHubhttps://github.com//issues/18#issuecomment-32687085
.

Best Regards,

Yo-An Lin

@c9s
Copy link
Owner

c9s commented Jan 18, 2014

I thought so too, but unfortunately it's not. also the hash table in zend object is not persistent. so we need to serialize them.

@c9s
Copy link
Owner

c9s commented Jan 19, 2014

I asked on the php internal mailing list and have some conclusions:

  1. zend object can not be persistent because the zend object store does not support it.
  2. the internal memory is not allocated persistently by default. so at the end of request, all emalloced memory will be freed.
  3. HashTable can be persistent.

@c9s
Copy link
Owner

c9s commented Jan 19, 2014

Joe posted some related links of copying HashTable from APC:

https://github.com/krakjoe/apcu/blob/simplify/apc_cache.c#L1164

https://github.com/krakjoe/apcu/blob/simplify/apc_cache.c#L1309

Perhaps we can copy the HashTable from zend object and store it back when fetching it from persistent_list.

@c9s
Copy link
Owner

c9s commented Jan 19, 2014

@jrbasso
Copy link
Author

jrbasso commented Jan 19, 2014

Interesting. At least has a workaround like APC does :)

@c9s
Copy link
Owner

c9s commented Jan 20, 2014

Just pushed some code c1ff18f to create a persistent hash of the properties.

but still segmentation fault when destroying resource list entry.

@c9s
Copy link
Owner

c9s commented Jan 22, 2014

Alright... I have been stranded by the zend garbage collection... zzz

The new idea comes from my head. just forgot about zval or zend object, let's convert routes into pure C struct and make it persistent, I guess it will be much easier than keeping a zend object or zend array persistent.

Another benefit is that while dispatching, we can simply iterate the lightweight fast linked list without using zend_hash struct and convert these zval(s) in every loop.

Using pure C struct can be even faster. stay tuned. =]

@c9s
Copy link
Owner

c9s commented Jan 22, 2014

Just fixed the memory leak and it works now! :D

Tested the performance and it's around 1.2x~1.6x faster now.

Will open another issue for the OO style persistent dispatch.

@c9s c9s closed this as completed Jan 22, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants