[RFC] Preloading support #7777

Open
Toflar opened this issue Nov 7, 2018 · 60 comments

@Toflar
Contributor

commented Nov 7, 2018

Since a general preloading file looks set to become a thing in PHP 7.4 (🎉), I think we should start discussing how this could be implemented in Composer. I'm willing to work on a PR for that, but a few things need to be clarified first.
Here are just a few thoughts that come to my mind when thinking about the implementation:

  • Should the preloading section be handled separately from the autoload section?
  • Does it even make sense to separate them? I mean, if you dump the optimized autoloader, why would you not want to preload all classes? Are there even scenarios where you would not want to preload all of your files?
  • Does it even make sense to preload all of the files or is that hurting the performance? Should we opt for hot-path preloading which in turn would imply that autoloading and preloading indeed must be separated?
  • vendor/preload.php?
  • If preloading can be seen as an extension to the optimized autoload, what would the command look like? composer dump-autoload -o -p (-p would then also generate the preload.php)

I'm sure there's more to be considered and discussed. There are some very smart cookies in our community, so let's try to lay out the best solution together first and only then start coding 😊

@Seldaek

Member

commented Nov 7, 2018

My first idea is that this would work like the files autoloader: a new preload autoloading type. That way every package can declare file(s) that must be preloaded, and we generate a single file with all the includes. Those files can then have smarter logic on what to preload. I'd expect Symfony would, for example, preload according to the container config and whatnot.

I don't think preloading all classes always is a good idea, that'll just blow up memory usage for no reason.

This is also very cheap to generate. Much like the files autoloading, it simply dumps an array of files into preload.php, so it can always be built; there's no need for it to be optional like the optimized autoloader.
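A rough sketch of the generation step described above. The helper name renderPreloadFile() is hypothetical and not part of Composer; it assumes the preload files declared by packages have already been collected into a flat list of absolute paths.

```php
<?php

/**
 * Hypothetical helper: renders the source of an aggregated preload file.
 * $files is an assumed input: the preload files declared by all packages.
 */
function renderPreloadFile(array $files): string
{
    $lines = ['<?php', ''];
    foreach (array_unique($files) as $file) {
        // var_export() yields a correctly quoted PHP string literal.
        $lines[] = 'require_once ' . var_export($file, true) . ';';
    }

    return implode("\n", $lines) . "\n";
}
```

Writing the result to vendor/preload.php would then be a single file_put_contents() call. Using require_once (rather than opcache_compile_file()) is a choice made here so that each declared file can run its own preload logic.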

@Seldaek Seldaek added the Feature label Nov 7, 2018

@Seldaek Seldaek added this to the 2.0 milestone Nov 7, 2018

@Toflar

Contributor Author

commented Nov 7, 2018

I see. That's a nice idea too!

I don't think preloading all classes always is a good idea, that'll just blow up memory usage for no reason.

I agree, it makes no sense to load all the classes if not all of them are used. However, as the developer of an app I would like to be able to optimize this myself so I don't think it makes sense to give the responsibility to define the classes that need to be preloaded to the developer of the library. Let's take the Symfony VarDumper component as an example: In 99.5% of all cases, this is used for debugging purposes only so it likely makes no sense to preload classes of it. But what if somebody builds an API service that uses this component to style some output? Preloading would make sense there then.
In other words: Whether or not preloading a class makes sense depends on how it's being used and more often than not, the developer of the library cannot tell how it's going to be used.

I also wondered how much RAM usage we're talking about. So here are some stats for everybody:

  • I reset my opcode cache. It uses 18.07 MB by default (I'm not sure where this default usage comes from, but every time I reset it, it was at 18.07 MB again, so I guess it's fair to take this as the starting point).
  • I ran composer create-project symfony/skeleton and called the welcome page in the prod environment (debug set to true).
  • After that the memory usage was at 19.56 MB (157 cached files).

Then I ran composer dump-autoload -o --no-dev and edited my index.php to do some simple preloading based on the autoload_classmap.php. So I really just used the _preload() function from the RFC and added this to my index.php:

$classmap = include './../vendor/composer/autoload_classmap.php';
$classmap = array_unique($classmap); // Needed because we have multiple classes in some files
_preload($classmap);

I reset the cache and visited the welcome page again: Memory usage increased to 27.74 MB (919 cached files).

So yes, memory usage does increase but is it really that significant? I mean, if you enable preloading in the first place you're after good performance, right? Is it a problem then that your app uses a fair amount of RAM constantly?

All I'm trying here is really to just throw in some numbers so we can weigh up the pros and cons.

| Variant | Pros | Cons |
|---|---|---|
| Preload classmap | Super simple; easy to implement; easy to use | Needs more RAM |
| New "preload" autoload section | More memory efficient | Package devs have to learn it; no control for app dev; likely to miss out files on the hot path |

Maybe there are more variants? 😄

@Toflar

Contributor Author

commented Nov 7, 2018

Here are the numbers for a bigger project (using ApiPlatform, Doctrine, Guzzle, Enqueue, Symfony Translator, Redis, Symfony Console, etc.):

Index page (docs endpoint) without preloading everything: 33.36 MB (15.29 MB net)
Index page (docs endpoint) with preloading everything: 75.31 MB (57.24 MB net)

@staabm

Contributor

commented Nov 7, 2018

So you need more than double the amount of RAM. In other words, you can serve less than half the users on the same server compared to the non-preloaded setup.

@Toflar

Contributor Author

commented Nov 7, 2018

Well, that's not a fair comparison. I called just one endpoint. I would need to call the other endpoints that use other components one after the other if we wanted to find out the percentage of "useless cached files". I think I'd get a lot closer to the 57.24 MB if I did that 😊

@staabm

Contributor

commented Nov 7, 2018

Still a lot of overhead. More than I would like to pay to call the feature useful when loading all the things.

IMO loading everything only works for small apps

@Seldaek

Member

commented Nov 7, 2018

I appreciate your enthusiasm, buuut I don't think any of this is composer's responsibility. I fully agree that it's most likely app specific what you want to preload, but there are two things to consider in my proposal:

  • the preload autoload doesn't have to be defined by every package. Typically I'd expect symfony to define that, or maybe some crypto lib that highly benefits from preloading assuming we get JIT optimizations in there.
  • you don't have to point your php.ini to the preload.php file that composer generates. If you want a custom one, build a custom one. Frameworks definitely should offer tools for that.

Lastly regarding the option of preloading the whole classmap, that's a 3-line foreach loop that you can write as your own preload script loading all classes from composer's classmap if you are so inclined to waste memory ;) It could also be offered as a package toflar/preload-all-the-things that has a preload autoload which then goes and includes all files from the classmap.

So my position at least for now is to either do some very simple thing to facilitate things and "standardize" on a vendor/preload.php file, or alternatively we do nothing at all.

@stof

Contributor

commented Nov 7, 2018

I tend to agree with @Seldaek here.

The preloading-everything strategy is already straightforward once you generate a full classmap for the optimized autoloader (which you should do anyway if you care about performance, and you don't need preloading if you don't care).
Any smarter algorithm would have to rely on the structure of the project, and so it might be hard to deal with that in Composer.
In Symfony, it would probably make no sense to have a config in the package deciding which Symfony files are preloaded for all Symfony projects. But Symfony could generate a preload file for projects using it (covering classes coming from Symfony, but also from other dependencies or from the project itself) based on some heuristic. This is being suggested in symfony/symfony#29105 (comment). (I took Symfony as the example because I know the current state of the discussion there, but the same could apply to other frameworks, of course.)

@Toflar

Contributor Author

commented Nov 7, 2018

BTW: I'm not inclined to waste memory at all. It was just one variant that came to my mind and so I elaborated on it. I think it's good to consider multiple approaches, also for the people that read the issue later on. I'm perfectly fine with having the bad ones ruled out 😄

We could allow the preload section only in the root composer.json (so project specific), and that's where you specify the files you'd like to be included. Whether or not you use other files that are dynamically generated by e.g. Symfony is your business then. But the important feature here would be that Composer lets you aggregate them out of the box (and maybe we can get the default php.ini setting to be set to vendor/preload.php in PHP itself, which would just be ignored if the file doesn't exist 😄).

@stof

Contributor

commented Nov 7, 2018

@Toflar if it is root-only, why ask people to put the files in composer.json so that Composer requires them in another file that you then reference in your php.ini? You could reference your own file directly in php.ini.

@Toflar

Contributor Author

commented Nov 7, 2018

I know 😄 The only thing it would do is somewhat "standardizing" the way to do it. Nothing else 😊

@aenglander


commented Nov 7, 2018

I would think that creating levels for autoloading like logging levels would allow for some sort of "automated" control with the library providing the files/classes that would be appropriate for the level. An uber level like "all" could include all files including the files for libraries not supporting the levels. Sounds like something that might fit into the PHP-FIG realm of discussion as well.

@Crell


commented Nov 7, 2018

Some disorganized thoughts:

  1. A way to whitelist files to preload needs to include regex support; in practice, most significant applications load hundreds of classes on every request. I'd rather eat the cost of preloading a few more than I need than having to list them all out manually.

  2. One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

  3. Since Composer is basically the universal autoloader at this point, is there a way that Composer can assist in determining what good candidates are to preload? I'm not entirely sure how it would do that without writing data to disk, which is probably undesirable, but if we're going to say "site owners, this is your job" we should try to give them enough information to make that job really easy. What they're going to want is a list of the X most loaded classes, or the classes used on more than Y% of requests, or something like that. Is that something Composer can help compute, and if not, what would?

  4. While I'm sympathetic to the simplicity of "preload all the things", there are some files that MUST NOT be preloaded. For instance, at Platform.sh we use a composer-loaded file (not class) to execute code before the application initializes. That lets us map host-provided environment variables into application-expected environment variables. The process works pretty well but that code needs to be run on every request, so the file has non-symbol-definition code (viz, it violates PSR-1), and so preloading it would break things. So it's probably useful for any auto-preload-builder thingie (Composer or otherwise) to include a way to let packages blacklist certain files that should never be preloaded.

  5. I don't really see a place for FIG here; What would be in scope for FIG would be "hey packages, here's how you expose your preload info". But really, even if we decide that's a package's job to do (and it may be), 99.999947% of the time that will be via composer.json, which is out of FIG's purview.
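Points 1 and 4 above can be sketched together as a filter over Composer's classmap. The function name and its parameters are hypothetical, not an existing Composer feature; the classmap is assumed to have the class => file shape of vendor/composer/autoload_classmap.php.

```php
<?php

/**
 * Hypothetical helper: selects files to preload from a class => file map.
 * A class is selected when it matches any regex in $includePatterns,
 * unless its file is listed in $excludedFiles (e.g. files with side effects).
 */
function selectPreloadFiles(array $classmap, array $includePatterns, array $excludedFiles = []): array
{
    $selected = [];
    foreach ($classmap as $class => $file) {
        if (in_array($file, $excludedFiles, true)) {
            continue; // explicitly excluded, never preload
        }
        foreach ($includePatterns as $pattern) {
            if (preg_match($pattern, $class) === 1) {
                $selected[$file] = true; // keyed by file to de-duplicate
                break;
            }
        }
    }

    return array_keys($selected);
}
```

For example, a pattern such as '/^App\\\\Http\\\\/' would select every class under the App\Http namespace without listing the classes one by one.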

@pmmaga


commented Nov 7, 2018

I totally agree that preload should be a separate section from autoload as they are different concepts. For one, you can use an autoloader in your preload script:

<?php

spl_autoload_register(function($name) {
    include_once("$name.php");
});

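// Referencing the class triggers the autoloader registered above,
// which includes Foo.php during preloading.
class_exists(Foo::class);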

About what should be preloaded and not, keep in mind that to refresh the preloaded files you must restart php. I think the most typical use-case would be to preload your vendor but keep your application out of it so you can deploy changes to your application without a server restart.
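The "preload vendor, keep the application out" idea could look roughly like this. The helper name vendorFiles() is made up for illustration, and the classmap is assumed to contain absolute paths, as Composer's autoload_classmap.php does.

```php
<?php

/**
 * Hypothetical helper: keeps only classmap files located under $vendorDir,
 * so application code can still be deployed without restarting PHP.
 */
function vendorFiles(array $classmap, string $vendorDir): array
{
    $prefix = rtrim($vendorDir, '/') . '/';
    $files = array_filter(
        array_unique($classmap),
        static function (string $file) use ($prefix): bool {
            return strpos($file, $prefix) === 0; // path starts with vendor dir
        }
    );

    return array_values($files);
}
```

A preload script could then loop over the returned paths and call opcache_compile_file() on each one.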

@Toflar

Contributor Author

commented Nov 8, 2018

One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

That's the same question I was asking myself. As I understand it, you preload stuff that would be loaded into memory later on anyway, just shared instead of on every request. Which in turn means that "preloading all the things" doesn't effectively change the amount of memory used, except for the percentage of classes that are not needed, aka classes that are shipped with a library but never used within the context of the app.
I'm not quite sure either, but it's very important to know, indeed.

@stof

Contributor

commented Nov 8, 2018

@Toflar your response seems to assume that all your requests are using all the classes from the codebase (so "used by the project" is the same as "used by the request"). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.

@teohhanhui


commented Nov 8, 2018

If Composer does decide to do something, it should be root-only. Letting your dependencies decide what to preload is not helpful.

@BenMorel


commented Nov 14, 2018

One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

Like @Crell and @Toflar, I was asking myself the same thing. I highly doubt that the memory used by preloading classes is copied to every single PHP process, I guess it's shared memory (to be verified), so preloading the whole stuff would "only" eat ~100MB, but every PHP process thereafter maybe eats less memory?

I'm personally in favour of having Composer generate a preload.php script that contains by default all the files from the preload section of the project itself and the vendor dependencies.

I don't mind if there's a way to exclude some files explicitly, though, if these files are most likely never used in the average project. But I'm not sure whether any such file would actually belong to the autoload section then, they would probably be dev classes used in autoload-dev only?

While I'm sympathetic to the simplicity of "preload all the things", there are some files that MUST NOT be preloaded. For instance, at Platform.sh we use a composer-loaded file (not class) to execute code before the application initializes. That lets us map host-provided environment variables into application-expected environment variables. The process works pretty well but that code needs to be run on every request, so the file has non-symbol-definition code (viz, it violates PSR-1), and so preloading it would break things. So it's probably useful for any auto-preload-builder thingie (Composer or otherwise) to include a way to let packages blacklist certain files that should never be preloaded.

As far as I understand it:

  • you don't have to execute the preloaded files, you can just opcache_compile_file() them; I would suggest that the preload.php file is just a big list of opcache_compile_file() statements;
  • To be verified: I wouldn't expect the code in the preloaded files to not execute. As I understand it, class definitions in such files are cached, but if you explicitly include a preloaded file, I think the code besides the classes will execute anyway.

The only issue I can think of, for your use case, is that class_exists() will return true because the file has been preloaded, so this might not trigger the autoloader; you would therefore have to include() the file explicitly. I would be curious to see what your motivation is for mixing class declarations with other code, though: should this (usually not recommended) approach prevent Composer from doing the right thing for most other users? No offense here, but I think that Composer should aim to support the recommended approach out of the box, and maybe your slightly exotic approach should require a custom preloading script?

@Toflar your response seems to assume that all your requests are using all the classes from the codebase (so "used by the project" is the same as "used by the request"). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.

Then preloading is not for these projects, as explicitly mentioned in the RFC:

And also, this approach will not be compatible with servers that host multiple applications, or multiple versions of applications - that would have different implementations for certain classes with the same name - if such classes are preloaded from the codebase of one app, it will conflict with loading the different class implementation from the other app(s).


As such, here is how I would implement preload.php (my 2 cents):

<?php

// list here all the files that are generated by the optimized autoloader

opcache_compile_file(...);
opcache_compile_file(...);
opcache_compile_file(...);
...

This would be the kind of file that would be generated with no extra configuration. It's there, you can use it if you want, but you don't have to.

Optionally, I would add a preload-exclude or equivalent composer.json option that allows excluding individual files, entire directories, or even vendor dependencies.

For those having multiple versions of an application / dependency: to reiterate, I would advocate to not use preloading at all. Or if you're feeling brave enough, fiddle with preload-exclude or write your own preload script.

For all other users out there, just including the auto-generated preload.php will work like magic.

@Toflar

Contributor Author

commented Nov 14, 2018

Maybe we can get @dstogov to help us out to make the right decision for the PHP community here 😊

@teohhanhui


commented Nov 14, 2018

@BenMorel:

@Toflar your response seems to assume that all your requests are using all the classes from the codebase (so "used by the project" is the same as "used by the request"). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.

Then preloading is not for these projects, as explicitly mentioned in the RFC:

And also, this approach will not be compatible with servers that host multiple applications, or multiple versions of applications - that would have different implementations for certain classes with the same name - if such classes are preloaded from the codebase of one app, it will conflict with loading the different class implementation from the other app(s).

You've misunderstood @Toflar's comment about some classes only being used in some requests of the application vs multiple applications / multiple versions of applications. They're not the same thing at all.

@BenMorel


commented Nov 14, 2018

@teohhanhui I don't think I misunderstood @Toflar's comment, I was quoting @stof who was himself replying to @Toflar.

To clarify my thoughts:

  • only some classes being used in some requests of the application:
    the project would still benefit from preloading all classes (unless memory is an issue, which I highly doubt)
  • multiple applications with different versions of the same dependency / multiple versions of applications:
    I would not use preloading at all, or leave it to the user to create their own preloading script if they know what they're doing.
@dstogov


commented Nov 14, 2018

Maybe we can get @dstogov to help us out to make the right decision for the PHP community here 😊

I think preloading is too new a feature to implement support for it in Composer immediately.
Adapting applications and frameworks to preloading should first identify the best solutions, missing functionality, etc.

I tried preloading a whole framework (ZendFramework) and application-specific preloading (getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)). The second approach works better.

Read about usage of Java Class Data Sharing. We implemented similar technology, and may borrow use cases.

@BenMorel


commented Nov 14, 2018

I tried preloading the whole frameworks (ZendFramework) and application specific preloading (getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)). The second approach works better.

Thanks for jumping in, @dstogov!

Could you please explain what you mean by works better? Did you get better performance? Did preloading all the classes take up too much memory? Did you have any issue?

@Crell


commented Nov 14, 2018

@dstogov Can you clarify the question above regarding the memory usage of a preloaded class? Viz, if I have 100 classes that are used on virtually every request anyway, and I then preload all of them, we know that's going to save CPU time. However, is it going to increase, decrease, or have no effect on memory usage?

Similarly, if we preload 10 MB worth of code that is used only on a small fraction of requests, and there are 10 concurrent requests, have we now increased total memory usage by 10 MB (shared memory) or 100 MB (cost in each process)?

@Crell


commented Nov 14, 2018

@BenMorel The example from Platform.sh is not a class at all. It's a file that looks something like this:

<?php

function stuff() {
  $_ENV['db_name'] = $_ENV['dbname'];
}

stuff();

(Because the application wants an environment variable with one name and our system by default provides it with another. This is a very over-simplified example but it gets the idea across.)

That file is then included by Composer so it runs during autoload, before the application looks for its environment variables. There's nothing intrinsically wrong with that approach and it works quite well right now. My point is that such a file MUST NOT be preloaded, because then it won't actually run on subsequent requests, which would break the application. We don't need to do anything special here for it other than make sure that it doesn't get picked up and preloaded accidentally by whatever mechanism Composer ends up using.

@Seldaek

Member

commented Nov 15, 2018

@Crell preloading such a file would have zero effect the way I understand it. As it doesn't declare a preloadable class it is ignored, and will be executed at runtime when included.

@BenMorel


commented Nov 19, 2018

@er1z

How should the Composer predict what set of classes has to be warmed-up?

It can't. That's precisely why I am advocating that Composer generates a catch-all preloader. I am confident that it should be faster than not using preloading (unless a benchmark proves me wrong, of course).

If you know how to fine-tune your preloading (for example using @dstogov's suggested approach above: "getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)"), then you're free to not use the preload.php file offered by Composer, and use an alternate approach.

Now, I see the "preload everything" approach as a default only: nothing prevents Composer from offering a way to specify which dependencies or directories to preload!

@er1z


commented Nov 19, 2018

So why waste resources by preloading all from package?

@BenMorel


commented Nov 19, 2018

@er1z To reiterate: this would just be an optional default: if you want to speed up your app with zero configuration, then you could use the default preload.php file, which is better than nothing, isn't it?

If you know what classes/dependencies you use most often, then you could tell composer to preload only those, should composer provide such a configuration option.

If you can actually profile your application and generate a preload PHP script from opcache_get_status(), then do so. I think a third-party script that does this would be nice, but I don't think this belongs to composer.

If you don't want to use the default preloader, then don't. Of course there will be some waste of memory (and maybe a few CPU cycles) by using a non-optimized preloader, but if the end result is better performance anyway, and less memory used per request, then why not?

@BenMorel


commented Nov 20, 2018

I ran a benchmark with a medium-size project of mine (90 package dependencies), that includes on every request a quite heavy bootstrap script that sets up classes for dependency injection; this makes it a good candidate for benchmarking class loading in real life conditions.

I benchmarked a simple page of the website, that makes very little use of the database, but alone triggers the autoloading of 380 classes. I tested the following configurations:

No preloading

I used composer's optimized autoloader: composer install --optimize-autoloader. The opcache was warmed up by loading the page manually.

Preloading only "hot" classes

I restarted the server, loaded enough pages from the website, then used this script to generate a preload file from cached files, as reported by opcache:

<?php

header('Content-Type: text/plain');

echo '<?php', PHP_EOL;

$status = opcache_get_status(true);

foreach ($status['scripts'] as $script) {
    $path = $script['full_path'];
    echo 'opcache_compile_file(', var_export($path, true), ');', PHP_EOL;
}

This script preloads 878 files.
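A possible variation on the script above, offered as a sketch only: each entry in the scripts list of opcache_get_status() also carries a hits counter, so the list could be narrowed to scripts used at least a given number of times. The helper name hotScripts() and the threshold are assumptions for illustration and were not part of the benchmark.

```php
<?php

/**
 * Hypothetical helper: picks "hot" scripts from an opcache status array.
 * $status has the shape returned by opcache_get_status(true).
 */
function hotScripts(array $status, int $minHits): array
{
    $paths = [];
    foreach ($status['scripts'] ?? [] as $script) {
        if (($script['hits'] ?? 0) >= $minHits) {
            $paths[] = $script['full_path'];
        }
    }
    sort($paths); // deterministic output regardless of cache order

    return $paths;
}
```

Calling hotScripts(opcache_get_status(true), 5) and emitting one opcache_compile_file() line per returned path would produce a smaller preload file; a sensible threshold depends on how long the cache was warmed up.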

Preloading all the classes

I used the following preload script, that preloads the whole composer classmap:

<?php

$files = require 'vendor/composer/autoload_classmap.php';

foreach (array_unique($files) as $file) {
    opcache_compile_file($file);
}

This script preloads 14,541 files.

Results

| Benchmark | Preloaded files | Server startup time | Opcache memory used | Per-request memory used | Requests per second |
|---|---|---|---|---|---|
| No preloading | 0 | 0.06 s | 16 MB after warmup | 1,825 KB | 596 rq/s |
| Preload hot classes | 878 | 0.26 s | 21 MB | 869 KB | 695 rq/s |
| Preload everything | 14,541 | 1.56 s | 105 MB | 881 KB | 675 rq/s |

We can see interesting performance benefits whenever we use preloading:

  • + 13% when preloading everything
  • + 16% when preloading only hot classes
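For reference, the gains can be recomputed from the requests-per-second column of the results table. The function name below is made up for this illustration.

```php
<?php

// Relative throughput gain over a baseline, in percent.
function speedupPercent(float $baseline, float $value): float
{
    return ($value - $baseline) / $baseline * 100;
}

// Requests per second taken from the results table.
$noPreload = 596.0;
$gains = [
    'everything'  => speedupPercent($noPreload, 675.0), // about 13.3
    'hot classes' => speedupPercent($noPreload, 695.0), // about 16.6
];

foreach ($gains as $label => $gain) {
    printf("%-12s +%.1f%%\n", $label, $gain);
}
```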

As predicted by @dstogov, execution is a bit faster when preloading only the classes used by a given project.

Server startup time is a non-issue to me, and the overhead of preloading everything vs preloading hot classes is only 84 MB here, so negligible on modern hardware.

What's interesting also, is that using preloading (everything or only hot classes, it doesn't matter) halved the memory consumption per request!

Wrapping up

  • preloading everything is not a big deal, and can be done with zero configuration;
  • preloading only hot classes is a bit more optimized, and is worth it if you can take the time to warm up your opcache enough to cover as many of the classes as possible, before running the preloader generator script.

Looking at the above scripts though, they're so trivial that I'm starting to wonder if Composer should provide a preload script at all. Just copy/paste the above and you're done.


Benchmark info:

  • VM with 4 cores, 4GB RAM on my i7 laptop with NVMe SSD
  • Fedora 29, latest PHP master compiled from source
  • Apache + PHP-FPM
  • opcache.memory_consumption=1000
  • opcache.max_accelerated_files=20000

Benchmarks have been run with Apache Bench, 1000 requests with 10 concurrent requests. All benchmarks have been run 50 times and the best result was used.

@zsuraski


commented Nov 20, 2018

These results look awesome.

I do think that native Composer support makes sense (already guesstimated that there was going to be one in my ZendCon and PHP Ruhr presentations that covered this!), but there are several things to take into account:

  1. This is really suitable only in the case of a single-app-per-server scenario, as different apps may bump into one another with conflicting dependencies - something you generally don't need to worry about without preloading. I don't believe Composer ever touches php.ini, so this shouldn't be an issue - as the last step of actually explicitly placing the preload.php file into php.ini would be a manual step that the user would have to do proactively, but if Composer does ever update php.ini, it shouldn't automatically enable preloading under any circumstances.

  2. As Dmitry mentioned, there are code patterns that can behave differently when preloaded - so far we've recognized function_exists(), and there may be other reflection-based code patterns that execute differently depending on what's loaded and what isn't, that could end up executing differently when preloaded vs. not.

All in all, if Composer provides a preload script that simply preloads all of the relevant classes as a convenience - and lets the user pull the trigger on actually using it - I think that would do the job nicely. It would probably be a good idea to include some comments at the top of that file (how to set it up, the fact it's for PHP >= 7.4, caveats to watch for, etc.).

BTW, on the benchmark, I would recommend running it for a slightly longer period of time. If I understand the stats correctly the benchmarks ran for just over a second (1000 requests at around 700 req/sec), which can typically end in not-so-stable results. Perhaps run with "-t 10" to see how many requests are squeezed in 10 seconds.

Thanks!

@Toflar

Contributor Author

commented Nov 20, 2018

@BenMorel thanks for the benchmarks! You should not run composer install --optimize-autoloader, though, but composer install --optimize-autoloader --no-dev. Otherwise you will also preload all require-dev autoload files 😄 So your benchmark should be updated there.

@zsuraski

commented Nov 20, 2018

One other thing I wanted to comment on re: memory consumption - preloading files actually doesn't end up consuming significantly more shared memory than it would take to regularly include()/require() them and store them in opcache. It will take a bit more - as in some cases we'd be able to resolve a few more inter-class dependencies during this stage and store the metadata in shared memory - but this should be negligible, as the main shared memory consumer is the actual opcodes (bytecode). And of course, this is shared memory well spent - as it saves both time and per-process memory later on during the request runtime.

Since most apps easily fit into several tens or hundreds of megabytes - which is really nothing on a server-wide basis (everything is shared), and since memory consumption per-process goes down significantly (and this is actually a lot more important, as it results in things like memory fragmentation) - I would think that it's better to err on the side of preloading too much than preloading too little.

Perhaps there can be a check that would alert the user in case the opcache shared memory size is inadequate for the amount of preloaded files? This can probably be implemented in the auto generated preload.php file, along with a recommendation on how to fix it - which should be very simple & cheap in most cases.

My 2c.

UPDATE: Dmitry already added this check & message in case the opcache runs out of memory/files right into the preloading implementation. So the userland implementation can be naive and just try to load everything.

@BenMorel

commented Nov 20, 2018

> These results look awesome.

They do, and even without preloading, considering the amount of work done (380 classes loaded and linked in real time, + running the actual code), it's incredible to be able to get 600 req/s on commodity hardware! Preloading is the cherry on the cake (before we get JIT 😉)!

> All in all, if Composer simply provides a preload script that simply preloads all of the relevant classes as a convenience - and lets the user pull the trigger on actually using it - I think that would do the job nicely.

That's what I've been advocating so far, but I think we needed a benchmark to be convinced! Anyway, the end user is free to include this file in their php.ini, or run their own preload script.

> BTW, on the benchmark, I would recommend running it for a slightly longer period of time.

> You should not run composer install --optimize-autoloader but composer install --optimize-autoloader --no-dev though.

Good point to both of you! I ran the benchmarks again with --no-dev and -t 10. The number of files hasn't changed much, strangely: down to 14096. Here are the (even better) results:

| Benchmark | Requests per second | Diff |
|---|---|---|
| No preloading | 631 rq/s | - |
| Preload hot classes | 738 rq/s | +17% |
| Preload everything | 712 rq/s | +13% |

The diff has only changed by a few decimal points, though!

> Perhaps there can be a check that would alert the user in case the opcache shared memory size is inadequate for the amount of preloaded files?
>
> UPDATE: Dmitry already added this check & message in case the opcache runs out of memory/files right into the preloading implementation. So the userland implementation can be naive and just try to load everything.

Exactly, here is the error message when starting the server with too low an opcache.memory_consumption:

Fatal Error Not enough shared memory for preloading!

@zsuraski

commented Nov 20, 2018

> Exactly, here is the error message when starting the server with a too low opcache.memory_consumption:
>
> Fatal Error Not enough shared memory for preloading!

It will be slightly more informative from now on, and point people to consider increasing the opcache.memory_consumption or opcache.max_accelerated_files (accordingly).

@Ayesh

Contributor

commented Nov 20, 2018

I put together a small and very rudimentary Composer plugin to generate a vendor/preload.php file from a given set of paths. I'd appreciate any feedback if you'd like to try it out.

https://github.com/Ayesh/Composer-Preload

@barryvdh

Contributor

commented Jul 4, 2019

But if the performance is actually better by just looking at the hot paths (opcache cached scripts after running multiple requests), does it even make sense to let Composer handle this? How would it be possible to determine which packages are actually 'hot'? If I require Guzzle, I might only use it on 1 of 1,000 requests, so it doesn't make much sense to always preload it. But if I use it on every single request, it does.

@BenMorel

commented Jul 4, 2019

@barryvdh I've personally given my opinion on this, I'll summarize it here:

  1. if anything, I think Composer should provide a preload.php script alongside autoload.php, that would be as simple as:

    <?php
    
    $files = require __DIR__ . '/composer/autoload_classmap.php';
    
    foreach (array_unique($files) as $file) {
        opcache_compile_file($file);
    }
    
  2. I don't think Composer can do a good job of generating an "optimized" preload script; that requires, as you said, analyzing hot paths in the actual application.

At least Composer would come with something that provides a nice performance boost out of the box. If people want to improve performance even more (by a few %), they can generate a preload script using a script like the one I used above, but I don't think such a script belongs in Composer.

Anyway, it's so trivial that I wouldn't mind if Composer did nothing about preloading.

If I were in charge of the project though, I would still go for point 1. Quick, easy, efficient enough.

@DarkGhostHunter

commented Jul 4, 2019

How could Composer analyse the application to get a list of hot classes? I think that a well-done hot classes list would need heuristics or something so advanced that it would be out of the scope of Composer itself. Hot classes vary from project to project, and the developer should be responsible for them.

I can see that Frameworks like CakePHP, Laravel, Symfony and others could benefit because they have classes that always load, and offer to Composer a list of these through a preload.php.

Another solution would be to have a package totally apart that could identify a list of classes being hit when a script runs, like during a standard request-response lifecycle, and then stop and return that list, leaving the developer to consider which of all need to be included or not.

Aside from all that, Composer should offer a key to manage the preloading. Then the developer could choose to include or exclude the preloading of certain packages through the root composer.json. So, if a package includes a huge list of classes that you barely use, you could exclude it, or override it.

It would also be cool to check how much memory preloading could take.

But anyway, Composer should stay away from the logic that decides what is hot and what is not.
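On checking how much memory preloading takes: one possible approach is to read the shared-memory figures that `opcache_get_status()` reports and compare them with and without preloading enabled. The sketch below only summarises the `memory_usage` section of that array; the helper name `opcache_memory_summary()` is made up for this example, and the numbers only mean something when the script runs under the same SAPI as the application.

```php
<?php
/**
 * Summarise shared-memory usage from the array returned by
 * opcache_get_status(false). Sizes are converted from bytes to MB.
 */
function opcache_memory_summary(array $status): array
{
    $mem   = $status['memory_usage'];
    $total = $mem['used_memory'] + $mem['free_memory'] + $mem['wasted_memory'];

    return [
        'used_mb'      => round($mem['used_memory'] / 1048576, 1),
        'free_mb'      => round($mem['free_memory'] / 1048576, 1),
        'used_percent' => $total > 0 ? round(100 * $mem['used_memory'] / $total, 1) : 0.0,
    ];
}

// opcache_get_status() returns false when OPcache is disabled for this SAPI.
$status = function_exists('opcache_get_status') ? opcache_get_status(false) : false;
if (is_array($status) && isset($status['memory_usage'])) {
    print_r(opcache_memory_summary($status));
}
```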

@DarkGhostHunter

commented Jul 4, 2019

It would be handy if Composer included a key to manage how to include the preloading (in production and on development):

  • No key, or empty key, would be no preload.
  • autoload would just preload all the autoloading classes. You could exclude certain namespaces or classes.
  • file would take preload scripts located somewhere with your own logic, like taking the autoloading classes and removing certain ones, or adding some of your own, whatever. These will be additive, meaning that if a class repeats itself it wouldn't matter, since Composer would just skip it.

The preload key is only taken from the root composer.json. If a package has the key, good for them.

{
    "preload": "autoload",
    "preload-dev": {
        "files": [
            "my-app/preload.php",
            "heuristic/preload.php"
        ]
    }
}

The reason why I'm against including this script (cool idea btw) is that it means using opcache. Since Composer lives outside the application lifecycle and PHP itself, a separate package is needed to create a helpful preload list through analysis.

There, my two cents.
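To make the "autoload, minus certain namespaces" option more concrete, here is a minimal sketch of how a class map could be filtered before it is preloaded. The `"preload"` key and its exclusion list are proposals from this thread, not an existing Composer feature, and `filter_classmap()` is a hypothetical helper.

```php
<?php
/**
 * Drop every class whose fully qualified name starts with one of the
 * excluded namespace prefixes. $classmap maps class names to file paths,
 * the same shape as Composer's autoload_classmap.php.
 */
function filter_classmap(array $classmap, array $excludedPrefixes): array
{
    return array_filter(
        $classmap,
        function (string $class) use ($excludedPrefixes): bool {
            foreach ($excludedPrefixes as $prefix) {
                if (strncmp($class, $prefix, strlen($prefix)) === 0) {
                    return false; // class lives in an excluded namespace
                }
            }
            return true;
        },
        ARRAY_FILTER_USE_KEY
    );
}

// Example: keep application classes, skip everything under Aws\.
$kept = filter_classmap(
    ['Aws\\S3\\S3Client' => 'vendor/aws/S3Client.php', 'App\\Kernel' => 'src/Kernel.php'],
    ['Aws\\']
);
```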

@NiZerin

commented Jul 9, 2019

When will PHP 7.4 hotload be fully supported?

@garygreen

commented Jul 10, 2019

> How could Composer analyse the application to get a list of hot classes?

Most sites have opcache enabled, which keeps statistics about the most used classes in the application. See @BenMorel's script above.

@zsuraski

You mention:

> I would think that it's better to err on the side of preloading too much than preloading too little.

Could you elaborate on this? Are you suggesting if too little was loaded then the application could potentially not work as it has missing dependencies/linked information?

As projects can vary massively in how many packages they have installed, I would personally opt to preload only "hot" files by default. There are many files in your project that will likely never get touched.

For example, we have aws-sdk-php installed, which is required by league/flysystem-aws-s3-v3. The package has over 1,000 PHP files which, according to my app's opcache, are never cached (probably because we only use a tiny fraction of this package, and only during a weekly cron task).

I'm personally not in favour of this "blind" approach of preloading everything in vendor. It seems like a poor overall strategy, especially if the analytics point towards hot loading being more efficient overall.
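For reference, the opcache statistics mentioned here are exposed per script by `opcache_get_status()`, whose `scripts` entries carry a `full_path` and a `hits` counter. A minimal sketch of turning that into a "hot" file list could look like this; the helper name `hot_scripts()` and the hit threshold are assumptions for illustration, and the status has to be collected in the web SAPI that actually serves the requests.

```php
<?php
/**
 * Return the paths of cached scripts with at least $minHits hits,
 * most frequently used first. $status is the array returned by
 * opcache_get_status() (with scripts included).
 */
function hot_scripts(array $status, int $minHits): array
{
    $hits = [];
    foreach ($status['scripts'] ?? [] as $script) {
        if ($script['hits'] >= $minHits) {
            $hits[$script['full_path']] = $script['hits'];
        }
    }
    arsort($hits); // highest hit count first, keys (paths) preserved

    return array_keys($hits);
}

// Example with a hand-written status array instead of live OPcache data.
$example = hot_scripts(['scripts' => [
    ['full_path' => '/app/a.php', 'hits' => 50],
    ['full_path' => '/app/b.php', 'hits' => 0],
    ['full_path' => '/app/c.php', 'hits' => 120],
]], 1);
```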

@DarkGhostHunter

commented Jul 11, 2019

> > How could Composer analyse the application to get a list of hot classes?
>
> Most sites have opcache enabled, which keeps statistics about the most used classes in the application. See @BenMorel's script above.
>
> @zsuraski
>
> You mention:
>
> > I would think that it's better to err on the side of preloading too much than preloading too little.
>
> Could you elaborate on this? Are you suggesting if too little was loaded then the application could potentially not work as it has missing dependencies/linked information?
>
> As projects can vary massively in how many packages they have installed, I would personally opt to preload only "hot" files by default. There are many files in your project that will likely never get touched.
>
> For example, we have aws-sdk-php installed, which is required by league/flysystem-aws-s3-v3. The package has over 1,000 PHP files which, according to my app's opcache, are never cached (probably because we only use a tiny fraction of this package, and only during a weekly cron task).
>
> I'm personally not in favour of this "blind" approach of preloading everything in vendor. It seems like a poor overall strategy, especially if the analytics point towards hot loading being more efficient overall.

I always look into packages and features having two ways of configuration: hands-off and manual.

Judging by the analytics, the hands-off mode should let Composer take the most used classes in the application and preload them up to a certain memory threshold (32-128 MB by default seems good for a standard application). Priority would go to the classes with more hits, leaving out those with fewer hits.

In manual mode, though, Composer should get a script that returns an array of classes or namespaces to preload.

The latter could be also good on production environments, since you could use a predetermined list of classes and namespaces to test the performance.
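The hands-off idea (most hits first, stop at a memory threshold) can be sketched as a simple greedy selection. This is only an illustration of the proposal, not something Composer implements: `select_within_budget()` is a hypothetical helper, and it assumes each entry has the `full_path`, `hits` and `memory_consumption` fields that `opcache_get_status()` reports per script.

```php
<?php
/**
 * Pick scripts in descending order of hits, skipping any script that
 * would push the running total over $budgetBytes.
 */
function select_within_budget(array $scripts, int $budgetBytes): array
{
    usort($scripts, function (array $a, array $b): int {
        return $b['hits'] <=> $a['hits']; // most hits first
    });

    $selected = [];
    $used = 0;
    foreach ($scripts as $script) {
        if ($used + $script['memory_consumption'] > $budgetBytes) {
            continue; // does not fit; a smaller script further down may still fit
        }
        $used += $script['memory_consumption'];
        $selected[] = $script['full_path'];
    }

    return $selected;
}

// Example with made-up numbers: a 100-byte budget.
$picked = select_within_budget([
    ['full_path' => 'b.php', 'hits' => 5,  'memory_consumption' => 50],
    ['full_path' => 'a.php', 'hits' => 10, 'memory_consumption' => 60],
    ['full_path' => 'c.php', 'hits' => 1,  'memory_consumption' => 30],
], 100);
```

Skipping a script that does not fit, rather than stopping outright, is a design choice for this sketch; a strict cut-off at the first overflow would be equally easy to write.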

@Ayesh

Contributor

commented Jul 11, 2019

From what I see, opcache has separate caches and statistics for fpm and cli, which means any forced preload, statistics, or clean actions probably won't have the same effect in a web app context. I worked on the very same feature in my Composer-Preload plugin, but I couldn't figure out how to bypass the separate bins opcache has for fpm and cli.

@DarkGhostHunter

commented Jul 11, 2019

@rask

commented Jul 16, 2019

Has anyone considered the memory usage on embedded systems where you actually do not have gigs and gigs of RAM available? If I want to host a small web server on some system that offers a humongous amount of 64 megabytes of RAM, I can preload like a dozen files and PHP will crap out if Composer has decided that "all in" is the best overall preload strategy?

(Somewhat exaggerated example, yes.)

I would say having a preload script generated only from the root project composer.json definition, and only with user-defined files is the best option, if Composer is to be used for preload file generation at all.

@DarkGhostHunter

commented Jul 16, 2019

@BenMorel

commented Jul 16, 2019

Guys, preloading requires an ini setting, so Composer won't stab you in the back either way, don't worry ;-)

@Crell

commented Jul 19, 2019

I am highly skeptical that Composer is the right place to do hot-path analysis to determine the optimal set of files to preload. Activating preload requires an ini setting as @BenMorel notes, so really all Composer would be able to do is generate a script that preloads everything (which you can then use or not) or allow libraries to declare "preload these files", and then generate a script that preloads just those.

Anything more complex would be, I think, way out of scope for Composer itself. (Maybe an extension?) Let's not give poor Jordi and Nils a task of implementing autoload machine learning, k? 😄

@DarkGhostHunter

commented Jul 20, 2019

> I am highly skeptical that Composer is the right place to do hot-path analysis to determine the optimal set of files to preload. Activating preload requires an ini setting as @BenMorel notes, so really all Composer would be able to do is generate a script that preloads everything (which you can then use or not) or allow libraries to declare "preload these files", and then generate a script that preloads just those.
>
> Anything more complex would be, I think, way out of scope for Composer itself. (Maybe an extension?) Let's not give poor Jordi and Nils a task of implementing autoload machine learning, k? 😄

If that's so, there should be no problem adding a composer preload command to generate the list to be preloaded as preload.php. Libraries could declare which files are "preloadable" under a key.

{
    "preload": {
        "psr-4": [
            "ServerUtils\\PingTools\\",
            "ServerUtils\\TracertManager"
        ],
        "files": [
            "helpers.php"
        ]
    }
}

If a library isn't preloadable, the developer can "add" a library to the autogenerated list using the project root's composer.json:

{
    "preload": {
        "psr-4": [
            "OldPackageNotPreloaded\\",
            "OtherNotPreloaded\\CommonClass"
        ],
        "files": [
            "my-app-helpers.php"
        ]
    }
}

The developer can use this list in development for a quick start, since vendor files shouldn't be risky to preload as they won't change. In production, the developer should ask OPcache for analytics and preload the best-performing list of classes for their application.

Would that be enough? Composer would only help to make a preliminary preload list, but the final decision is still in the developer's hands.
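To show how the library-level and root-level declarations above might combine, here is a minimal sketch that merges several `"preload"` sections into one de-duplicated list. The `"preload"` key is only a proposal in this thread, and `merge_preload_sections()` is a hypothetical helper; a real implementation would also have to resolve each `files` entry relative to the package that declared it, which this sketch ignores.

```php
<?php
/**
 * Merge any number of "preload" sections (decoded from composer.json)
 * into one, keeping the first occurrence of each entry.
 */
function merge_preload_sections(array ...$sections): array
{
    $merged = ['psr-4' => [], 'files' => []];
    foreach ($sections as $section) {
        foreach (array_keys($merged) as $type) {
            foreach ($section[$type] ?? [] as $entry) {
                $merged[$type][$entry] = true; // array keys de-duplicate entries
            }
        }
    }

    // Turn the de-duplication maps back into plain lists.
    return array_map('array_keys', $merged);
}

// Example: one library section plus the root project's additions.
$library = ['psr-4' => ['ServerUtils\\PingTools\\'], 'files' => ['helpers.php']];
$root    = ['psr-4' => ['ServerUtils\\PingTools\\', 'OldPackageNotPreloaded\\'], 'files' => ['my-app-helpers.php']];
$all     = merge_preload_sections($library, $root);
```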

@teohhanhui

commented Jul 20, 2019

@DarkGhostHunter As has been pointed out multiple times in this thread, just preload everything. No need for such unnecessary configuration, which will most likely end up preloading too little anyway, thereby defeating the whole purpose.

@Ayesh

Contributor

commented Jul 20, 2019

I also agree with what @Crell said about making composer determine the autoload files. This can easily increase the complexity of Composer and should not be part of a dependency manager.

The plugin I shamelessly self-plugged is also trying to generate a preload.php file based on the composer.json directives. With current Opcache statistics limitations, I don't think it's even possible to reliably list and stat opcache files across different SAPIs (cli to fpm, etc).

@DarkGhostHunter

commented Jul 20, 2019

> @DarkGhostHunter As has been pointed out multiple times in this thread, just preload everything. No need for such unnecessary configuration, which will most likely end up preloading too little anyway, thereby defeating the whole purpose.

On a development machine it could work since there are no high RAM limitations like in a production machine.
