An attempt to implement "preloading" ability. #3538

Open
wants to merge 23 commits into
base: master
from

Conversation

9 participants
@dstogov
Contributor

dstogov commented Sep 18, 2018

On startup, PHP may execute a script defined by the opcache.preload configuration directive.
All functions and classes loaded by this script become permanent and available to all following requests.
For example, it's possible to preload almost the whole Zend Framework and save significant time on each request.

This is still an unfinished PoC.

Contributor

dstogov commented Sep 18, 2018

A preload script for Zend Framework

<?php
// Recursively compile every file under $preload whose path matches $pattern,
// skipping any path listed in $ignore.
function _preload($preload, string $pattern = "/\.php$/", array $ignore = []) {
    if (is_array($preload)) {
        // A list of paths: handle each one in turn.
        foreach ($preload as $path) {
            _preload($path, $pattern, $ignore);
        }
    } else if (is_string($preload)) {
        $path = $preload;
        if (!in_array($path, $ignore)) {
            if (is_dir($path)) {
                // Walk the directory, skipping the "." and ".." entries.
                if ($dh = opendir($path)) {
                    while (($file = readdir($dh)) !== false) {
                        if ($file !== "." && $file !== "..") {
                            _preload($path . "/" . $file, $pattern, $ignore);
                        }
                    }
                    closedir($dh);
                }
            } else if (is_file($path) && preg_match($pattern, $path)) {
                // Compile the file into opcache without executing it.
                if (!opcache_compile_file($path)) {
                    trigger_error("Preloading Failed", E_USER_ERROR);
                }
            }
        }
    }
}

set_include_path(get_include_path() . PATH_SEPARATOR . realpath("/var/www/ZendFramework/library"));
_preload(["/var/www/ZendFramework/library"]);
?>

Contributor

dstogov commented Sep 18, 2018

@nikic, @laruence could you please take a look.

Actually, this is not exactly what I would like to do.
PHP still has to perform significant work after each request, restoring preloaded classes and functions to their initial state in accel_deactivate.

Maybe you'll have some related ideas.

Member

nikic commented Sep 20, 2018

The main idea here is to save the overhead of class binding at run-time, is that right?

I'm wondering how invalidation would be handled for preloads. If one of the (potentially many) files included by the preload is changed, can we detect this efficiently and invalidate the necessary parts (or just everything)?

Contributor

dstogov commented Sep 20, 2018

@nikic

The main idea here is to save the overhead of class binding at run-time, is that right?

Not only. The idea is to completely eliminate compilation and opcache overhead (copying from SHM to process memory and insertions into function/class tables on each request). Using this technique, we might write standard functions and classes in PHP (similar to systemlib.php in HHVM).

I'm wondering how invalidation would be handled for preloads. If one of the (potentially many) files included by the preload is changed, can we detect this efficiently and invalidate the necessary parts (or just everything)?

No. The preloaded functions and classes must not be invalidated at all (like internal ones).

Persistent, linked classes would open the way for more aggressive optimizations, especially in conjunction with JIT. Unfortunately, the current implementation still has to copy zend_class_entry from SHM to process memory, which causes overhead and trouble.

Some related work was done in the Java world: https://simonis.github.io/JBreak2018/CDS/cds.xhtm http://www.inf.usi.ch/faculty/nystrom/papers/cdn02-ecoop.pdf

beberlei commented Sep 20, 2018

@dstogov this is amazing!

Could you explain how, or if, this would affect extensions that profile userland functions by a.) hooking into zend_execute_ex or b.) overwriting the function pointers?

Contributor

dstogov commented Sep 21, 2018

Could you explain how, or if, this would affect extensions that profile userland functions by a.) hooking into zend_execute_ex or b.) overwriting the function pointers?

Preloaded op_arrays are not very different from op_arrays restored from opcache, so profilers should work in the same way. However, preloaded op_arrays survive the request boundary and have to be reset to their initial state.

alex-mashin commented Oct 1, 2018

  1. What would PHP code look like that checks whether the preload feature is available and certain functions and classes are preloaded, and defines them if they are not? Will it be necessary at all, or will code that tries to overload preloaded functions and classes be ignored, so that even unmodified existing CMSs benefit from this feature?
  2. Will the opcache.preload option cause whole directories to be preloaded, like the script for Zend Framework that you posted above?

Contributor

dstogov commented Oct 1, 2018

Note that this is currently just an idea, and the implementation is not good enough yet.

1. What would PHP code look like that checks whether the preload feature is available and certain functions and classes are preloaded, and defines them if they are not?

It's possible to check for preloaded functions and classes using function_exists()/class_exists(), but such checks may also change the original logic of some apps (e.g. WordPress executes PHP scripts when some function is not defined; if the function is preloaded, this code is not executed and the application doesn't work as expected).

Will it be necessary at all, or will code that tries to overload preloaded functions and classes be ignored, so that even unmodified existing CMSs benefit from this feature?

Inclusion of preloaded scripts should work out of the box ("redefinitions" are going to be ignored), but redefinition in other scripts will trigger errors (Cannot redeclare ...).

2. Will the opcache.preload option cause whole directories to be preloaded, like the script for Zend Framework that you posted above?

opcache.preload accepts just a PHP script. This script may implement the actual loading in whatever way you need.
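
As a rough illustration of the rule described above (re-including a preloaded script is ignored, while a definition of the same name coming from another script fails with "Cannot redeclare ..."), here is a minimal C sketch. It is not the opcache implementation: func_table, declare_function and the DECL_* results are hypothetical names, and keying the decision on the defining file path is an assumption made purely for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FUNCS 32

/* Possible outcomes of a declaration attempt (hypothetical names). */
typedef enum {
    DECL_ADDED,            /* first definition, stored in the table      */
    DECL_IGNORED,          /* same name from the same file: silently ok  */
    DECL_ERROR_REDECLARE,  /* same name from another file: error         */
    DECL_ERROR_FULL        /* the fixed-size demo table has no room left */
} decl_result;

typedef struct {
    const char *name;  /* function name                */
    const char *file;  /* script that first defined it */
} func_entry;

typedef struct {
    func_entry entries[MAX_FUNCS];
    int count;
} func_table;

/* Try to declare `name` as coming from `file`, applying the rule above. */
static decl_result declare_function(func_table *t, const char *name, const char *file)
{
    assert(t != NULL && name != NULL && file != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            /* Already known: only a re-inclusion of the defining file is tolerated. */
            return strcmp(t->entries[i].file, file) == 0
                ? DECL_IGNORED
                : DECL_ERROR_REDECLARE;
        }
    }
    if (t->count == MAX_FUNCS) {
        return DECL_ERROR_FULL;
    }
    t->entries[t->count].name = name;
    t->entries[t->count].file = file;
    t->count++;
    return DECL_ADDED;
}
```

With a table filled during preloading, declaring "helper" again from the same path would return DECL_IGNORED, while declaring it from a different path would return DECL_ERROR_REDECLARE.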

Contributor

dstogov commented Oct 4, 2018

@nikic see the next attempt, based on immutable classes (first commit). Don't spend a lot of time on it; just give it some attention, and maybe share ideas about problems and improvements.

It now gives a 50x speedup on the ZF loading benchmark and ~35% on ZF1 HelloWorld (3700 req/sec vs 2700 req/sec). Although the implementation is far from complete and there are a few serious problems, it now looks much more promising.

ext/opcache/ZendAccelerator.c (outdated)
	if (ce->num_interfaces) {
		found = 1;
		for (i = 0; i < ce->num_interfaces; i++) {
			p = zend_hash_find_ptr(EG(class_table), ce->interface_names[i].lc_name);

@nikic

nikic Oct 5, 2018

Member

Does EG(class_table) also include internal classes at this point? If so, I'm wondering how this will be handled on Windows where IIRC multiple processes (with internal classes at different addrs) may attach to one shm instance.

@dstogov

dstogov Oct 5, 2018

Contributor

Yes, I know about this problem, but I prefer to postpone it. Possible options are not supporting this feature on Windows, or copying internal classes into SHM.

Member

nikic commented Oct 5, 2018

@dstogov I think the new approach looks reasonable. What I slightly dislike is that we now have two different code paths for the immutable and the non-immutable cases for runtime cache, static vars and static props. I'm wondering if it would make sense to make all functions/classes (post-link) immutable to make the way they are handled uniform. This would require making the request_data handling more dynamic (something similar to the object store: dynamic resizing + basic free list).
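
The "dynamic resizing + basic free list" idea mentioned above can be sketched roughly as follows. This only illustrates the general technique, not code from php-src: slot_store and its functions are hypothetical names, and the real request_data layout is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* A slot either holds a live pointer or, while free, the index of the next free slot. */
typedef union {
    void   *ptr;
    int32_t next_free;
} slot;

typedef struct {
    slot    *slots;      /* growable array of slots                   */
    uint32_t top;        /* first index that has never been handed out */
    uint32_t capacity;   /* number of allocated slots                 */
    int32_t  free_head;  /* head of the free list, -1 when empty      */
} slot_store;

static void store_init(slot_store *s)
{
    s->slots = NULL;
    s->top = 0;
    s->capacity = 0;
    s->free_head = -1;
}

/* Returns the index of the slot now holding `value`, or -1 on allocation failure. */
static int32_t store_alloc(slot_store *s, void *value)
{
    int32_t index;
    if (s->free_head >= 0) {
        /* Reuse the most recently released slot. */
        index = s->free_head;
        s->free_head = s->slots[index].next_free;
    } else {
        if (s->top == s->capacity) {
            /* Out of room: double the array (start with 4 slots). */
            uint32_t new_cap = s->capacity ? s->capacity * 2 : 4;
            slot *grown = realloc(s->slots, new_cap * sizeof(slot));
            if (grown == NULL) {
                return -1;
            }
            s->slots = grown;
            s->capacity = new_cap;
        }
        index = (int32_t)s->top++;
    }
    s->slots[index].ptr = value;
    return index;
}

/* Put a slot back on the free list; its index may be handed out again. */
static void store_release(slot_store *s, int32_t index)
{
    assert(index >= 0 && (uint32_t)index < s->top);
    s->slots[index].next_free = s->free_head;
    s->free_head = index;
}

static void *store_get(const slot_store *s, int32_t index)
{
    return s->slots[index].ptr;
}

static void store_destroy(slot_store *s)
{
    free(s->slots);
    store_init(s);
}
```

Indices stay stable when the array grows, which is why callers keep an index rather than a pointer into the array.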

Contributor

dstogov commented Oct 5, 2018

@nikic thanks for the review. Your thoughts about non-uniform handling make sense. I'll try to solve all implementation problems first, and then return to this question. Maybe I'll replace index access through CG(request_data) with indirect pointer access (through a special MAP region for pointers from SHM to process memory; this is what HotSpot does).
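
A minimal sketch of the indirect-access idea above, assuming that shared immutable data stores a tagged offset into a per-process table rather than a raw pointer. All names here (map_table, map_ptr_t, the map_ptr_* functions) are hypothetical, and the low-bit tagging scheme is an assumption for illustration; this is not the actual map_ptr code from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process slot table; each process (or thread) would own one. */
#define MAP_SLOTS 16
static void  *map_table[MAP_SLOTS];
static size_t map_used = 0;

/* A map pointer is either the address of a slot (low bit 0)
 * or a tagged index into map_table (low bit 1). */
typedef uintptr_t map_ptr_t;

static int map_ptr_is_offset(map_ptr_t p)
{
    return (p & 1u) != 0;
}

/* Reserve a table slot and return its tagged index. The returned value is
 * position independent, so it can live in memory shared between processes. */
static map_ptr_t map_ptr_new_offset(void)
{
    assert(map_used < MAP_SLOTS);
    size_t index = map_used++;
    return ((uintptr_t)index << 1) | 1u;
}

/* Wrap the address of a process-local slot as a direct map pointer. */
static map_ptr_t map_ptr_from_slot(void **slot)
{
    assert(((uintptr_t)slot & 1u) == 0); /* pointer alignment keeps the tag bit free */
    return (uintptr_t)slot;
}

/* Resolve a map pointer to the slot that holds the per-process value. */
static void **map_ptr_slot(map_ptr_t p)
{
    if (map_ptr_is_offset(p)) {
        return &map_table[p >> 1];
    }
    return (void **)p;
}

static void *map_ptr_get(map_ptr_t p)
{
    return *map_ptr_slot(p);
}

static void map_ptr_set(map_ptr_t p, void *value)
{
    *map_ptr_slot(p) = value;
}
```

The shared structure itself never changes after it is built; only the per-process table entry that the offset resolves to is written at run time.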

zbenc commented Oct 10, 2018

@dstogov Thanks for working on this!
Would this enable static inlining of PHP functions within other preloaded files, e.g. helper functions from one file getting inlined at the call sites in other files?
That would be one more great benefit of preloading IMO.

Contributor

dstogov commented Oct 10, 2018

In general, immutable classes/functions and preloading would simplify various "Whole Program Optimization" methods, including class hierarchy analysis and inter-script inlining. On the other hand, the dynamic nature of PHP and the language-defined lifetime of variables, plus destructors, cause a lot of trouble. For now I'm thinking only about preloading; optimization is a future step.

php-pulls commented Oct 12, 2018

Comment on behalf of petk at php.net:

Labelling

Contributor

dstogov commented Oct 12, 2018

@nikic please take a quick look at the next iteration. It's still not finished, but immutable classes are now supported in ZTS builds and on Windows. In non-ZTS builds the patch brings an improvement even without preloading, because immutable classes are not copied into process memory on each request.
I hope I only have to solve a few edge cases (file_cache, opcache restart, preloading in ZTS and on Windows). We will also need a new API to replace run-time op_array->reserved[] access for immutable functions (I'm going to reserve space in op_array->run_time_cache).
Do you see any other problems?

@nikic

I looked only at the Zend part (not opcache). I think the approach makes sense and I don't see any immediate problems.

I think it would make sense to separate out the introduction of the map ptr mechanism and all the related changes and land them before the preloading functionality, as they are mostly independent and preloading will probably need more work before it's practically usable.

Contributor

dstogov commented Oct 12, 2018

@nikic thanks for the review. I completely agree about separating immutable+map_ptr from preloading. This is what I tried to do in the first two commits. I hope I'll be able to deliver the first part next week.

dstogov added some commits Oct 16, 2018

Merge branch 'master' into immutable
* master:
  Remove unused variable makefile_am_files
  Classify object handlers are required/optional
  Add support for getting SKIP_TAGSTART and SKIP_WHITE options
  Remove some obsolete config_vars.mk occurrences
  Remove bsd_converted from .gitignore
  Remove configuration parser and scanners ignores
  Remove obsolete buildconf.stamp from .gitignore
  [ci skip] Add magicdata.patch exception to .gitignore
  Remove outdated ext/spl/examples items from .gitignore
  Remove unused test.inc in ext/iconv/tests

dstogov added some commits Oct 17, 2018

Hide offset encoding magic in ZEND_MAP_PTR_IS_OFFSET(), ZEND_MAP_PTR_OFFSET2PTR() and ZEND_MAP_PTR_PTR2OFFSET() macros.
Merge branch 'immutable' into preload
* immutable:
  Added comment
  Added type cast
  Moved static class members initialization into the proper place.
  Removed redundand assertion
  Removed duplicate code
  Hide offset encoding magic in ZEND_MAP_PTR_IS_OFFSET(), ZEND_MAP_PTR_OFFSET2PTR() and ZEND_MAP_PTR_PTR2OFFSET() macros.
  typo
  Remove unused variable makefile_am_files
  Classify object handlers are required/optional
  Add support for getting SKIP_TAGSTART and SKIP_WHITE options
  Remove some obsolete config_vars.mk occurrences
  Remove bsd_converted from .gitignore
  Remove configuration parser and scanners ignores
  Remove obsolete buildconf.stamp from .gitignore
  [ci skip] Add magicdata.patch exception to .gitignore
  Remove outdated ext/spl/examples items from .gitignore
  Remove unused test.inc in ext/iconv/tests
Merge branch 'immutable' into preload
* immutable:
  Reverted back ce->iterator_funcs_ptr. Initialize ce->iterator_funcs_ptr fields in immutable classes.
Merge branch 'master' into immutable
* master:
  Remove the "auto" encoding
  Fixed bug #77025
  Add vtbls for EUC-TW encoding
Merge branch 'immutable' into preload
* immutable:
  Remove the "auto" encoding
  Fixed bug #77025
  Add vtbls for EUC-TW encoding
Merge branch 'master' into preload
* master:
  Immutable clases and op_arrays.
Merge branch 'master' into preload
* master:
  Fixed comment
  Micro optimizations
  Mark "top-level" classes
Merge branch 'master' into preload
* master:
  Mark "top-level" functions.
  Don't initialize static_member_tables during start-up, when inherit internal classes.
  [ci skip] Update NEWS
  [ci skip] Update NEWS
  [ci skip] Update NEWS
  Fix #77035: The phpize and ./configure create redundant .deps file
  Remove outdated PEAR artefacts
  Fix tests/output/bug74815.phpt generating errors.log
  Revert "Use C++ symbols, when C++11 or upper is compiled"
  Use C++ symbols, when C++11 or upper is compiled
  Added new line
  Remove stamp-h
  Move all testing docs to qa.php.net
  Fix a typo in UPGRADING.INTERNALS
  Fix test when it's run on another drive
  [ci skip] Update UPGRADING wrt. tidyp support
  Fixed incorrect reallocation
  Fix #77027: tidy::getOptDoc() not available on Windows
  Run CI tests under opcache.protect_memory=1

php-pulls commented Oct 19, 2018

Comment on behalf of petk at php.net:

Re-labelling this masterpiece.

@php-pulls php-pulls added RFC and removed Enhancement labels Oct 19, 2018
