Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An attempt to implemnt "preloading" ability. #3538

Closed
wants to merge 65 commits into from
Closed

Conversation

dstogov
Copy link
Member

@dstogov dstogov commented Sep 18, 2018

On startup, PHP may execute a script defined by opcache.preload configuration directive.
All function and classes loaded by this script became permanent and available to all the following requests.
For example, it's possible to preload almost whole Zend Framework, and save significant time on each request.

This is an unfinished PoC yet.

@dstogov
Copy link
Member Author

dstogov commented Sep 18, 2018

A preload script for Zend Framework

<?php
function _preload($preload, string $pattern = "/\.php$/", array $ignore = []) {
	if (is_array($preload)) {
		foreach ($preload as $path) {
			_preload($path, $pattern, $ignore);
		}
	} else if (is_string($preload)) {
		$path = $preload;
	    if (!in_array($path, $ignore)) {
			if (is_dir($path)) {
				if ($dh = opendir($path)) {
        			while (($file = readdir($dh)) !== false) {
        				if ($file !== "." && $file !== "..") {
		        		    _preload($path . "/" . $file, $pattern, $ignore);
						}
		    	    }
    		    	closedir($dh);
	    		}
			} else if (is_file($path) && preg_match($pattern, $path)) {
				if (!opcache_compile_file($path)) {
					trigger_error("Preloading Failed", E_USER_ERROR);
				}
			}
		}
	}
}

set_include_path(get_include_path() . PATH_SEPARATOR . realpath("/var/www/ZendFramework/library"));
_preload(["/var/www/ZendFramework/library"]);
?>

@dstogov
Copy link
Member Author

dstogov commented Sep 18, 2018

@nikic, @laruence could you please take a look.

Actually, this is not exactly what I liked to do.
PHP still has to perform significant work after each request, restoring preloaded classes and functions into initial state, at accel_deactivate

May be you'll get some related ideas.

@nikic
Copy link
Member

nikic commented Sep 20, 2018

The main idea here is to save the overhead of class binding at run-time, is that right?

I'm wondering how invalidation would be handled for preloads. If one of the (potentially many) files included by the preload is changed, can we detect this efficiently and invalidate the necessary parts (or just everything)?

@dstogov
Copy link
Member Author

dstogov commented Sep 20, 2018

@nikic

The main idea here is to save the overhead of class binding at run-time, is that right?

Not only. The idea is to completely eliminate compilation and opcache overhead (copying from SHM to process memory and insertions into function/class tables on each request). Using this technique, we might write standard functions and classes in PHP (similar to systemlib.php in HHVM).

I'm wondering how invalidation would be handled for preloads. If one of the (potentially many) files included by the preload is changed, can we detect this efficiently and invalidate the necessary parts (or just everything)?

No. The preloaded functions and classes must not be invalidated at all (like internal ones).

Persistent, linked classes would open a way for more aggressive optimizations, especially with conjunction with JIT. Unfortunately, the current implementation still have to copy zend_class_entry from SHM to process memory and this makes overhead and troubles.

Some related work was done in Java World: https://simonis.github.io/JBreak2018/CDS/cds.xhtm http://www.inf.usi.ch/faculty/nystrom/papers/cdn02-ecoop.pdf

@beberlei
Copy link
Contributor

@dstogov this is amazing!

Could you explain how or if this would affect extensions that profile userland functions with regard to a.) hooking into zend_execute_ex or b.) overwrite the function pointers.

@dstogov
Copy link
Member Author

dstogov commented Sep 21, 2018

Could you explain how or if this would affect extensions that profile userland functions with regard to a.) hooking into zend_execute_ex or b.) overwrite the function pointers.

The preloaded op_arrays is not very different from op_arrays restored from opcache. So profilers should work in the same way. However, preloaded op_arrays relive request boundary and should be reset into initial state.

@alex-mashin
Copy link

  1. What would PHP code that checks whether preload feature is available and certain functions and classes are preloaded and defines them if they are not look like? Will it be necessary at all, or the code that tires to overload preloaded functions and classes will be ignored, so that even unmodified existing CMS's will benefit from this feature?
  2. Will the opcache.preload option cause whole directories to be preloaded like the script for Zend framework that you posted above?

@dstogov
Copy link
Member Author

dstogov commented Oct 1, 2018

Note, that currently this is just an idea and implementation is not good enough.

1. What would PHP code that checks whether preload feature is available and certain functions and classes are preloaded and defines them if they are not look like?

It's possible to check preloaded function and classes using function_exists/class exists, but these checks also may change the original logic of some apps (e.g Wordpress execute PHP scripts when some function not defined, but if the function is preloaded this code is not executed and application doesn't work as expected)

Will it be necessary at all, or the code that tires to overload preloaded functions and classes will be ignored, so that even unmodified existing CMS's will benefit from this feature?

Inclusion of preloaded scripts should work out of the box ("redefinitions" are going to be ignored), but redefinition in other scripts will trigger errors (Cannot redeclare ...)

2. Will the opcache.preload option cause whole directories to be preloaded like the script for Zend framework that you posted above?

opcache.prelaod accepts just a PHP script. This script may implement the actual loading in a way you need.

@dstogov
Copy link
Member Author

dstogov commented Oct 4, 2018

@nikic see the next attempt based on immutable classes (first commit). Don't spend a lot of time, just give your attendance, and may be ideas about problems and improvement.

Now it makes 50 times speedup on ZF loading benchmark and ~35% on ZF1 HelloWorld (3700 req/sec vs 2700 req/sec). Although, the implementation is far from completion and there are few serious problems, now it looks mush more promising.

if (op_array->type == ZEND_USER_FUNCTION) {
ZEND_ASSERT(op_array->fn_flags & ZEND_ACC_IMMUTABLE);
if (op_array->static_variables) {
destroy_op_array(op_array);
Copy link
Member

@nikic nikic Oct 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be freeing the op_array rather than just the static_variables table in the request_data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "immutable" arrays we have to free only static_variables. Everything else kept in SHM, except for run_time cache, that is in CG(arena). Or may be I misunderstood the question.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, the code here currently frees the op_array, but the assertion above says that this is an immutable op_array, so I would expect it to only free static_variables.

@@ -2425,6 +2486,152 @@ int accel_post_deactivate(void)
return SUCCESS;
}

/* Free static variables of preloaded classes and functions */
if (ZCSG(preload_script)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup that is done here for immutable functions/classes, can all of it be skipped for fast shutdown? I.e. just zeroing the request data should be sufficient?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must be right, but current implementation also may preload non-immutable classes, and we have to deal with them. I'll think how to solve this.

if (ce->num_interfaces) {
found = 1;
for (i = 0; i < ce->num_interfaces; i++) {
p = zend_hash_find_ptr(EG(class_table), ce->interface_names[i].lc_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does EG(class_table) also include internal classes at this point? If so, I'm wondering how this will be handled on Windows where IIRC multiple processes (with internal classes at different addrs) may attach to one shm instance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I know about this problem, but prefer to postpone it. e.g. don't support this feature on Windows or copy internal classes into SHM.

@nikic
Copy link
Member

nikic commented Oct 5, 2018

@dstogov I think the new approach looks reasonable. What I slightly dislike is that we now have two different code paths for the immutable and the non-immutable cases for runtime cache, static vars and static props. I'm wondering if it would make sense to make all functions/classes (post-link) immutable to make the way they are handled uniform. This would require making the request_data handling more dynamic (something similar to the object store: dynamic resizing + basic free list).

@dstogov
Copy link
Member Author

dstogov commented Oct 5, 2018

@nikic thanks for review. Your thoughts about non-uniform handling make sense. I'll try to solve all implementation problems first, and then return to this question. May be replace index access through CG(request_data), by indirect pointer access (through special MAP region for pointers from SHM to process memory. This is what HotSpot does)

#endif
if (!ZCG(mem)) {
zend_shared_alloc_unlock();
zend_accel_error(ACCEL_LOG_FATAL, "No enogh shared memory for preloading!");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No enough shared memory for preloading

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raziel057 No > Not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then maybe
Not enough shared memory for preloading

😁

@zbenc
Copy link

zbenc commented Oct 10, 2018

@dstogov Thanks for working on this!
Would this enable static inlining of PHP functions within other preloaded files, e.g. helper functions from one file getting inlined at the call sites in other files?
That would be one more great benefit of preloading IMO.

@dstogov
Copy link
Member Author

dstogov commented Oct 10, 2018

In general, immutable classes/functions and preloading would simplify various "Whole Program Optimization" methods, including class hierarchy analysis and inner script inlining. On the other hand, dynamic nature of PHP and defined by the language life-time of variables + destructors makes a lot of troubles. Currently, I think just about preloading, optimization is the future step.

@@ -84,6 +84,9 @@ typedef int gid_t;
#include <immintrin.h>
#endif

// TODO: remove ???
#include <sys/mman.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know its a WIP but this part (along with the mmap call) won't work for Windows and it needs to be using the File Mapping API (hence the AppVeyor fail)

@php-pulls
Copy link

Comment on behalf of petk at php.net:

Labelling

@dstogov
Copy link
Member Author

dstogov commented Oct 12, 2018

@nikic please take a quick look into the next iteration. It's still not finished, but now immutable classes are supported in ZTS build and on Windows. I non-ZTS build the patch makes improvement even without preloading, because immutable classes are not copied into process memory on each request.
I hope, I have to solve only few edge cases (file_cache, opcache restart, preloading in ZTS and Windows). We will also need new API instead of run-time op_array->reserved[] access of immutable functions.(I'm going to reserve space in op_array->run_time_cache).
Do you see any other problems?

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked only at the Zend part (not opcache). I think the approach makes sense and I don't see any immediate problems.

I think it would make sense to separate out the introduction of the map ptr mechanism and all the related changes and land them before the preloading functionality, as they are mostly independent and preloading will probably need more work before it's practically usable.

Zend/zend.c Outdated

if (CG(map_ptr_last) >= CG(map_ptr_size)) {
#if ZEND_MAP_PTR_KIND == ZEND_MAP_PTR_KIND_PTR
// TODO: error ???
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not totally sure about the significance of the initial buffer size. Is this effectively the limit on (approximately) the number of functions that can be preloaded by opcache (but not how many can be loaded at runtime, as they don't go through this map)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. This is how many pointers from opcache SHM to PROCESS we may have.

Zend/zend_API.c Outdated
}
ZEND_MAP_PTR_NEW(ce->static_members_table);
#else
ZEND_MAP_PTR_INIT(ce->static_members_table, pecalloc(1, sizeof(void*), 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the ZTS/NTS distinction here necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With NTS we may avoid MAP slot allocation, but, probably, you are right, and it makes sense to try unifying this.


static zend_always_inline zend_class_iterator_funcs* zend_get_iterator_funcs_ptr(zend_class_entry *ce) /* {{{ */
{
if (ZEND_MAP_PTR_GET(ce->iterator_funcs_ptr)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this use ZEND_MAP_PTR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Immutable classes are going to have ZEND_MAP_PTR set (pointer to the MAP region), but initially ZEND_MAP_PTR_GET is going to be NULL.

my_function.op_array.fn_flags |= ZEND_ACC_HEAP_RT_CACHE;
memset(my_function.op_array.run_time_cache, 0, my_function.op_array.cache_size);
ptr = emalloc(sizeof(void*) + my_function.op_array.cache_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worthwhile to add a helper for this type of allocation (for request and arena), which has an additional prefixed indirection pointer. It's used a couple of times here and also for the iterator funcs.

@dstogov
Copy link
Member Author

dstogov commented Oct 12, 2018

@nikic thanks for review. I completely agree, about separation of immutable+map_ptr and preloading. This is what I tried to do in first two commits. I hope, I'll able to deliver first part on next week.

@pmmaga
Copy link
Contributor

pmmaga commented Nov 8, 2018

Hi, currently, trying to include a composer generated autoloader in the preload script is failing an assertion (debug maintainer-zts):

[Thu Nov  8 10:54:40 2018]  Script:  '-'
/home/pedro/dev/php/php-src/Zend/zend_constants.c(477) :  Freeing 0x00007f92cfc03038 (24 bytes), script=-
Last leak repeated 2 times
[Thu Nov  8 10:54:40 2018]  Script:  '-'
/home/pedro/dev/php/php-src/Zend/zend_objects.c(180) :  Freeing 0x00007f92cfc0b600 (200 bytes), script=-
[Thu Nov  8 10:54:40 2018]  Script:  '-'
/home/pedro/dev/php/php-src/Zend/zend_string.h(132) :  Freeing 0x00007f92cfcbc1e0 (40 bytes), script=-
Last leak repeated 2 times
=== Total 7 memory leaks detected ===
php: /home/pedro/dev/php/php-src/Zend/zend_constants.c:61: copy_zend_constant: Assertion `(((c)->value).u2.constant_flags & 0xff) & (1<<1)' failed.
Aborted

This is the autoloader generated by cloning the latest master of composer, running composer install and then trying to include the vendor/autoload.php.
FWIW, a simple autoloader works:

<?php

spl_autoload_register(function($name) {
    include_once("$name.php");
});

use Foo;

Let me know if you need any additional info.

@dstogov
Copy link
Member Author

dstogov commented Nov 8, 2018

composer/composer#7777 (comment)

Another question to clarify memory consumption

I also mean it works that way, persistent class opcodes does not copied into process memory from SHM, but shared between fpm processes

Could you confirm that, please?

Right.

@dstogov
Copy link
Member Author

dstogov commented Nov 8, 2018

OK, so using include can be safe

I tried out php.ini's auto_prepend_file directive

auto_prepend_file = pre.php

bool(true)
bool(true)
bool(true)
bool(true)

So persistent classes available in auto_prepend_file too.

It makes sense to prohibit auto_prepend/append for preloading. I'll fix this.

@friizer
Copy link

friizer commented Nov 8, 2018

Is there a problem with that, or why do you want to prohibit?

I dont have a problem with it, just checked to make sure it works too

@dstogov
Copy link
Member Author

dstogov commented Nov 8, 2018

ops. Everything is fine. I thought prepend file was preloaded, but this is not true.

@dstogov
Copy link
Member Author

dstogov commented Nov 8, 2018

Hi, currently, trying to include a composer generated autoloader in the preload script is failing an assertion (debug maintainer-zts):

[Thu Nov  8 10:54:40 2018]  Script:  '-'
/home/pedro/dev/php/php-src/Zend/zend_constants.c(477) :  Freeing 0x00007f92cfc03038 (24 bytes), script=-
Last leak repeated 2 times
[Thu Nov  8 10:54:40 2018]  Script:  '-'
/home/pedro/dev/php/php-src/Zend/zend_objects.c(180) :  Freeing 0x00007f92cfc0b600 (200 bytes), script=-
[Thu Nov  8 10:54:40 2018]  Script:  '-'
/home/pedro/dev/php/php-src/Zend/zend_string.h(132) :  Freeing 0x00007f92cfcbc1e0 (40 bytes), script=-
Last leak repeated 2 times
=== Total 7 memory leaks detected ===
php: /home/pedro/dev/php/php-src/Zend/zend_constants.c:61: copy_zend_constant: Assertion `(((c)->value).u2.constant_flags & 0xff) & (1<<1)' failed.
Aborted

This is the autoloader generated by cloning the latest master of composer, running composer install and then trying to include the vendor/autoload.php.
FWIW, a simple autoloader works:

<?php

spl_autoload_register(function($name) {
    include_once("$name.php");
});

use Foo;

Let me know if you need any additional info.

I hope this should be fixed now.

@Tobion
Copy link
Contributor

Tobion commented Nov 8, 2018

I assume the preloading does not have any affect in CLI?
Do these settings affect preloading in any way, esp. file_cache_only? Would this disable preloading because there is no shared memory to put the preloaded stuff into? Or is preload not executed in CLI anyway?

opcache.preload = myscript.php
opcache.enable_cli = 1
opcache.file_cache = "/dev/shm/php-opcache"
opcache.file_cache_only = 1

@dstogov
Copy link
Member Author

dstogov commented Nov 9, 2018

I assume the preloading does not have any affect in CLI?
Do these settings affect preloading in any way, esp. file_cache_only? Would this disable preloading because there is no shared memory to put the preloaded stuff into? Or is preload not executed in CLI anyway?

opcache.preload = myscript.php
opcache.enable_cli = 1
opcache.file_cache = "/dev/shm/php-opcache"
opcache.file_cache_only = 1

Preloading works in CLI, as well, but doesn't work without opcache or in file_cache_only mode.

@friizer
Copy link

friizer commented Nov 9, 2018

Thanks,

That fixed last of my errors with preloading

I just run my functional tests against the API server (symfony stack),
Nearly 1000 tests run over one hour, through the full stack, without a single crash

Over 10000 http requests, aprx 4000 transactions, 60000 db queries worked without single problem

@Tobion
Copy link
Contributor

Tobion commented Nov 9, 2018

Preloading works in CLI, as well, but doesn't work without opcache or in file_cache_only mode.

@dstogov what does it mean it works? Does it mean, the preloading is executed and does not fail? But it does not give any performance advantage because every CLI execution would do it's own preloading that cannot be shared?

@dstogov
Copy link
Member Author

dstogov commented Nov 13, 2018

@Tobion preloading doesn't improve performance in CLI, but anyway, it loads file(s) before the main script and makes all the preloaded entities available for this script.

@dstogov
Copy link
Member Author

dstogov commented Nov 14, 2018

merged into master

@dstogov dstogov closed this Nov 14, 2018
@jamespatterson
Copy link

I am currently using a custom PHP implementation for process management, primarily to get the advantage of preloaded code and config data, but also to communicate with worker processes (forks of the master process) using IPC to direct config reloads, preloaded resource reloads, etc. This sounds like I could use FPM for all the preloading, as long as I stop depending on a long-lived process that acts as the controller. Is there any way with FPM to load custom PHP code into the process manager? Or is all this preloading done per-process spawn instead of once per FPM startup?

@dstogov
Copy link
Member Author

dstogov commented Nov 14, 2018

@jamespatterson preloading is done once per main PHP process and available for all forked workers. It helps keeping code between request, but it doesn't keep the state (variables, config data, constructed arrays/object graphs etc).

@csdougliss
Copy link

csdougliss commented Nov 15, 2018

Is there a way to clear the pre-load similar to opcache_clear(); ?

For example, if you deploy a new set of code to your existing running system (I'm thinking Jenkins, to running web server, with changed php code?)

@dstogov
Copy link
Member Author

dstogov commented Nov 15, 2018

@ craigcarnell no, you'll have to restart PHP.

@kocoten1992
Copy link

Hi @dstogov, is this feature work well with service reload ?
https://github.com/php/php-src/blob/master/sapi/fpm/php-fpm.service.in#L13

@nikic
Copy link
Member

nikic commented Jan 19, 2019

@kocoten1992 No, this feature is not compatible with reloads. It requires a full php-fpm restart.

@garygreen
Copy link

garygreen commented Apr 24, 2019

Preloading Limitation

Only classes without unresolved parent, interfaces, traits and constant values may be preloaded.


Can anyone explain what it means by "unresolved parent"?

@dstogov
Copy link
Member Author

dstogov commented Apr 24, 2019

class A extends B {
}

if class with name "B" is not found in preloaded scripts, then A has an "unresolved parent".

@Rookiebrother
Copy link

How should I call
final class A implements Serializable{
Public function __construct()
{
Throw new \Exception('No implemented');
}
}
Thank you

@Tobion
Copy link
Contributor

Tobion commented May 19, 2019

@LittleRookie2015 class_exists(A::class);

@brendt
Copy link
Contributor

brendt commented Jun 27, 2019

I'm sorry if this is not the right place to ask this question, but I couldn't seem to find anything written about this in the RFC.

When exactly will a class be "unlinked"? https://github.com/php/php-src/pull/3538/files#diff-f6d4b551c6233faa294ae9a97897c7d4R3368

@dstogov
Copy link
Member Author

dstogov commented Jun 27, 2019

@brendt
Preloading tries to combine classes from different files and link them together (perform inheritance etc), but this is not always possible, because, some classes may be missed (or compiled conditionally). So, if for some class we didn't find parent class (or some of interfaces or traits) we can't link it.

@brendt
Copy link
Contributor

brendt commented Jun 27, 2019

@dstogov so I'm trying to preload the whole Laravel framework as a test, and getting thousands of these warnings. What's confuses me though, is that when trying to preload a single file, it works.

Here's an example, this is my preload script, it's located in the project root dir.

<?php

class Preloader
{
    private array $paths;

    public function __construct(string ...$paths)
    {
        $this->paths = $paths;
    }

    public function load(): void
    {
        foreach ($this->paths as $path) {
            $this->loadPath($path);

            set_include_path($path);
        }
    }

    protected function loadPath(string $path): void
    {
        if (is_dir($path)) {
            $this->loadDir($path);

            return;
        }

        if (substr($path, -4) !== '.php') {
            return;
        }

        $isPreloaded = opcache_compile_file($path);

        if (! $isPreloaded) {
            trigger_error("Preloading failed", E_USER_ERROR);
        }
    }

    protected function loadDir(string $path): void
    {
        $handle = opendir($path);

        while ($file = readdir($handle)) {
            if (in_array($file, ['.', '..'])) {
                continue;
            }

            $this->loadPath($path . "/" . $file);
        }

        closedir($handle);
    }
}

(new Preloader([
    __DIR__ . '/vendor/laravel',
]))->load();

this script shows thousands of warnings like this one:

php_1    | <b>Warning</b>:  Can't preload unlinked class Illuminate\Database\Eloquent\Relations\HasMany in <b>/shared/httpd/preload/vendor/laravel/framework/src/Illuminate/Database/Eloquent/Relations/HasMany.php</b> on line <b>7</b><br />
php_1    | <br />

However loading, for example, /shared/httpd/preload/vendor/laravel/framework/src/Illuminate/Database/Eloquent/Relations/HasMany.php on its own, works:

(new Preloader(
    __DIR__ . '/vendor/laravel/framework/src/Illuminate/Database/Eloquent/Relations/HasMany.php',
))->load();

@brendt
Copy link
Contributor

brendt commented Jun 28, 2019

A few hours later and I get what's happening: all classes used by a class must be preloaded as well, in order for that class to be prelaodable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet