[RFC] Preloading support #7777

Open
Toflar opened this issue Nov 7, 2018 · 77 comments

Comments

@Toflar
Contributor

@Toflar Toflar commented Nov 7, 2018

As a general preloading file seems to become a thing in PHP 7.4 (🎉) I think we should start the discussion on how this could be implemented in Composer. I'm willing to work on a PR for that but there are a few things to be clarified first.
Here are just a few thoughts that come to my mind when thinking about the implementation:

  • Should the preloading section be handled separately from the autoload section?
  • Does it even make sense to separate them? I mean, if you dump the optimized autoloader, why would you not want to preload all classes? Are there even scenarios where you would not want to preload all of your files?
  • Does it even make sense to preload all of the files or is that hurting the performance? Should we opt for hot-path preloading which in turn would imply that autoloading and preloading indeed must be separated?
  • vendor/preload.php?
  • If preloading can be seen as an extension to the optimized autoload, what would the command look like? composer dump-autoload -o -p (-p would then also generate the preload.php)

I'm sure there's more to be considered and discussed. There are some very smart cookies in our community, so let's try to lay out the best solution together first and only then start coding 😊

@Seldaek
Member

@Seldaek Seldaek commented Nov 7, 2018

My first idea on the way this would work would be like the files autoloader, a new preload autoloading type. That way every package can declare file(s) that must be preloaded, and we generate a single file with all the includes. Those files can then have more smart logic on what to preload. I'd expect symfony would for example preload according to the container config and whatnot.

I don't think preloading all classes always is a good idea, that'll just blow up memory usage for no reason.

This is also very cheap to generate. Much like the files autoloading, it simply dumps an array of files into preload.php, so it can always be built; there is no need for it to be optional like the optimized autoloader.

@Seldaek Seldaek added the Feature label Nov 7, 2018
@Seldaek Seldaek added this to the 2.0 milestone Nov 7, 2018
@Toflar
Contributor Author

@Toflar Toflar commented Nov 7, 2018

I see. That's a nice idea too!

I don't think preloading all classes always is a good idea, that'll just blow up memory usage for no reason.

I agree, it makes no sense to load all the classes if not all of them are used. However, as the developer of an app I would like to be able to optimize this myself so I don't think it makes sense to give the responsibility to define the classes that need to be preloaded to the developer of the library. Let's take the Symfony VarDumper component as an example: In 99.5% of all cases, this is used for debugging purposes only so it likely makes no sense to preload classes of it. But what if somebody builds an API service that uses this component to style some output? Preloading would make sense there then.
In other words: Whether or not preloading a class makes sense depends on how it's being used and more often than not, the developer of the library cannot tell how it's going to be used.

I also wondered how much RAM usage we're talking about. So here are some stats for everybody:

  • I reset my opcode cache. It uses 18.07 MB by default (not sure where this default usage comes from, but every time I reset it, it was at 18.07 again, so I guess it's fair to take this as the starting point)
  • I ran composer create-project symfony/skeleton and called the welcome page in the prod environment (debug set to true).
  • After that the memory usage was at 19.56 MB (157 cached files).

Then I ran composer dump-autoload -o --no-dev and edited my index.php to do some simple preloading based on the autoload_classmap.php. So I really just used the _preload() function from the RFC and added this to my index.php:

$classmap = include './../vendor/composer/autoload_classmap.php';
$classmap = array_unique($classmap); // Needed because we have multiple classes in some files
_preload($classmap);

I reset the cache and visited the welcome page again: Memory usage increased to 27.74 MB (919 cached files).
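Figures like the ones above can be read from OPcache itself. What follows is a minimal sketch, assuming the OPcache extension is enabled for the SAPI being measured; `summarize_opcache` is a made-up helper name, not part of PHP or Composer.

```php
<?php
// Sketch: summarise OPcache memory use and the number of cached scripts.
// `summarize_opcache` is a hypothetical helper used only for illustration.
function summarize_opcache(array $status): string
{
    $usedMb = $status['memory_usage']['used_memory'] / 1024 / 1024;
    $files  = $status['opcache_statistics']['num_cached_scripts'];

    return sprintf('%.2f MB (%d cached files)', $usedMb, $files);
}

// Passing false skips the per-script details. The call can return false
// when OPcache is not available for the current SAPI, hence the guards.
$status = function_exists('opcache_get_status') ? opcache_get_status(false) : false;
if (is_array($status) && isset($status['memory_usage'], $status['opcache_statistics'])) {
    echo summarize_opcache($status), PHP_EOL;
}
```

Run it through the same SAPI that serves the application (for example via a temporary web endpoint), since the CLI usually has its own, separate cache.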

So yes, memory usage does increase but is it really that significant? I mean, if you enable preloading in the first place you're after good performance, right? Is it a problem then that your app uses a fair amount of RAM constantly?

All I'm trying here is really to just throw in some numbers so we can weigh up the pros and cons.

| Variant | Pros | Cons |
| --- | --- | --- |
| Preload classmap | Super simple; easy to implement; easy to use | Needs more RAM |
| New "preload" autoload section | More memory efficient | Package devs have to learn it; no control for app dev; likely to miss out files on the hot path |

Maybe there are more variants? 😄

@Toflar
Contributor Author

@Toflar Toflar commented Nov 7, 2018

Here are the numbers for a bigger project (using ApiPlatform, Doctrine, Guzzle, Enqueue, Symfony Translator, Redis, Symfony Console etc.):

Index page (docs endpoint) without preloading everything: 33.36 MB (15.29 MB net)
Index page (docs endpoint) with preloading everything: 75.31 MB (57.24 MB net)

@staabm
Contributor

@staabm staabm commented Nov 7, 2018

So you need more than double the amount of RAM. In other words, you can serve fewer than half the users compared to the non-preloaded setup on the same server.

@Toflar
Contributor Author

@Toflar Toflar commented Nov 7, 2018

Well, that's not a fair comparison. I called just one endpoint. I would need to call the other endpoints that use other components one after the other if we wanted to find out the percentage of "useless cached files". I think I'd get a lot closer to the 57.24 MB if I did that 😊

@staabm
Contributor

@staabm staabm commented Nov 7, 2018

Still a lot of overhead. More than I would like to pay to call the feature useful when loading all the things.

IMO loading everything only works for small apps

@Seldaek
Member

@Seldaek Seldaek commented Nov 7, 2018

I appreciate your enthusiasm, buuut I don't think any of this is composer's responsibility. I fully agree that it's most likely app specific what you want to preload, but there are two things to consider in my proposal:

  • the preload autoload doesn't have to be defined by every package. Typically I'd expect symfony to define that, or maybe some crypto lib that highly benefits from preloading assuming we get JIT optimizations in there.
  • you don't have to point your php.ini to the preload.php file that composer generates. If you want a custom one, build a custom one. Frameworks definitely should offer tools for that.

Lastly regarding the option of preloading the whole classmap, that's a 3-line foreach loop that you can write as your own preload script loading all classes from composer's classmap if you are so inclined to waste memory ;) It could also be offered as a package toflar/preload-all-the-things that has a preload autoload which then goes and includes all files from the classmap.
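For reference, the "preload the whole classmap" loop mentioned above might look roughly like this. It is only a sketch: it assumes `composer dump-autoload -o` has been run, that the script sits in the project root next to `vendor/`, and that it is referenced from `opcache.preload`; `classmap_files` is a hypothetical helper.

```php
<?php
// Sketch of a "preload everything" script built on Composer's classmap.
// `classmap_files` is a made-up helper name, not Composer functionality.
function classmap_files(array $classmap): array
{
    // Several classes can live in one file, so de-duplicate the paths.
    return array_values(array_unique($classmap));
}

// Assumed location: this script lives in the project root.
$classmapFile = __DIR__ . '/vendor/composer/autoload_classmap.php';
if (is_file($classmapFile) && function_exists('opcache_compile_file')) {
    foreach (classmap_files(require $classmapFile) as $file) {
        // Compiles and caches the file without executing it.
        opcache_compile_file($file);
    }
}
```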

So my position at least for now is to either do some very simple thing to facilitate things and "standardize" on a vendor/preload.php file, or alternatively we do nothing at all.

@stof
Contributor

@stof stof commented Nov 7, 2018

I tend to agree with @Seldaek here.

The preloading-everything strategy is already straightforward once you generate a full classmap for the optimized autoloader (which you should do anyway if you care about performance, and you don't need preloading if you don't care).
Any smarter algorithm would have to rely on the structure of the project, and so it might be hard to deal with that in Composer.
In Symfony, it would probably make no sense to have a config in the package deciding which Symfony files are preloaded for all Symfony projects. But Symfony could generate a preload file for projects using it (and so with classes coming from Symfony, but also from other dependencies or from the project itself) based on some heuristic. This is being suggested in symfony/symfony#29105 (comment) (I took the example of Symfony here because I know about the current state of the discussion there, but the same could apply to other frameworks of course).

@Toflar
Contributor Author

@Toflar Toflar commented Nov 7, 2018

BTW: I'm not inclined to waste memory at all. It was just one variant that came to my mind and so I elaborated on it. I think it's good to consider multiple approaches, also for the people that read the issue later on. I'm perfectly fine with having the bad ones ruled out 😄

We could allow the preload section only in the root composer.json (so project specific) and that's where you specify the files you like to be included. Whether or not you use other files that are dynamically generated by e.g. Symfony is your business then. But the important feature here would be Composer that lets you aggregate out of the box (and maybe we can get the default php.ini setting to be set to vendor/preload.php in php itself which would just be ignored if it doesn't exist 😄).

@stof
Contributor

@stof stof commented Nov 7, 2018

@Toflar if it is root-only, why ask to put them in composer.json so that composer requires them in another file that you can then reference in your php.ini? You could reference your own file directly in the php.ini.

@Toflar
Contributor Author

@Toflar Toflar commented Nov 7, 2018

I know 😄 The only thing it would do is somewhat "standardizing" the way to do it. Nothing else 😊

@aenglander

@aenglander aenglander commented Nov 7, 2018

I would think that creating levels for autoloading like logging levels would allow for some sort of "automated" control with the library providing the files/classes that would be appropriate for the level. An uber level like "all" could include all files including the files for libraries not supporting the levels. Sounds like something that might fit into the PHP-FIG realm of discussion as well.

@Crell

@Crell Crell commented Nov 7, 2018

Some disorganized thoughts:

  1. A way to whitelist files to preload needs to include regex support; in practice, most significant applications load hundreds of classes on every request. I'd rather eat the cost of preloading a few more than I need than having to list them all out manually.

  2. One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

  3. Since Composer is basically the universal autoloader at this point, is there a way that Composer can assist in determining what good candidates are to preload? I'm not entirely sure how it would do that without writing data to disk, which is probably undesirable, but if we're going to say "site owners, this is your job" we should try to give them enough information to make that job really easy. What they're going to want is a list of the X most loaded classes, or the classes used on more than Y% of requests, or something like that. Is that something Composer can help compute, and if not, what would?

  4. While I'm sympathetic to the simplicity of "preload all the things", there are some files that MUST NOT be preloaded. For instance, at Platform.sh we use a composer-loaded file (not class) to execute code before the application initializes. That lets us map host-provided environment variables into application-expected environment variables. The process works pretty well but that code needs to be run on every request, so the file has non-symbol-definition code (viz, it violates PSR-1), and so preloading it would break things. So it's probably useful for any auto-preload-builder thingie (Composer or otherwise) to include a way to let packages blacklist certain files that should never be preloaded.

  5. I don't really see a place for FIG here; What would be in scope for FIG would be "hey packages, here's how you expose your preload info". But really, even if we decide that's a package's job to do (and it may be), 99.999947% of the time that will be via composer.json, which is out of FIG's purview.
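Points 1 and 4 (a regex whitelist plus a blacklist of files that must never be preloaded) can be combined in one small filter. The following is only a sketch; the function name, its parameters and the example paths are assumptions made for illustration.

```php
<?php
// Sketch: choose preload candidates with a regex whitelist and an explicit
// blacklist. Everything here is hypothetical, not an existing Composer API.
function select_preload_files(array $files, string $includePattern, array $neverPreload = []): array
{
    $selected = [];
    foreach ($files as $file) {
        if (in_array($file, $neverPreload, true)) {
            continue; // e.g. a bootstrap file that must run on every request
        }
        if (preg_match($includePattern, $file) === 1) {
            $selected[] = $file;
        }
    }

    return $selected;
}

// Example with made-up paths: keep vendor files, skip the env bootstrap file.
print_r(select_preload_files(
    ['/app/vendor/symfony/Kernel.php', '/app/vendor/acme/env-bootstrap.php', '/app/src/Controller.php'],
    '#/vendor/#',
    ['/app/vendor/acme/env-bootstrap.php']
));
```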

@pmmaga

@pmmaga pmmaga commented Nov 7, 2018

I totally agree that preload should be a separate section from autoload as they are different concepts. For one, you can use an autoloader in your preload script:

<?php

spl_autoload_register(function($name) {
    include_once("$name.php");
});

class_exists(Foo::class); // triggers the autoloader above, which includes Foo.php

About what should be preloaded and not, keep in mind that to refresh the preloaded files you must restart php. I think the most typical use-case would be to preload your vendor but keep your application out of it so you can deploy changes to your application without a server restart.
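The vendor-only split described above could be sketched as follows. It assumes absolute paths such as those found in Composer's classmap; `vendor_files` is a made-up helper name.

```php
<?php
// Sketch: keep only the files that live under the vendor directory, so that
// application code can still be deployed without restarting PHP.
function vendor_files(array $files, string $vendorDir): array
{
    // The trailing slash keeps siblings such as "/app/vendorized" out.
    $prefix = rtrim($vendorDir, '/') . '/';

    return array_values(array_filter($files, static function (string $file) use ($prefix): bool {
        return strncmp($file, $prefix, strlen($prefix)) === 0;
    }));
}
```

The result can then be passed to `opcache_compile_file()` one file at a time from the preload script.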

@Toflar
Contributor Author

@Toflar Toflar commented Nov 8, 2018

One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

That's the same question I was asking myself. Because I understand it the way that you preload stuff that would be loaded into memory later on anyway. Just not on every request but shared. Which in turn means that "preloading all the things" doesn't effectively change the amount of memory used except for the percentage of classes that are not needed aka classes that are shipped with a library but never used within the context of the app.
I'm not quite sure either, but it's very important to know, indeed.

@stof
Contributor

@stof stof commented Nov 8, 2018

@Toflar your response seems to assume that all your requests are using all the classes from the codebase (so used by the project is the same as used by the request). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.

@teohhanhui

@teohhanhui teohhanhui commented Nov 8, 2018

If Composer does decide to do something, it should be root-only. Letting your dependencies decide what to preload is not helpful.

@BenMorel

@BenMorel BenMorel commented Nov 14, 2018

One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

Like @Crell and @Toflar, I was asking myself the same thing. I highly doubt that the memory used by preloading classes is copied to every single PHP process, I guess it's shared memory (to be verified), so preloading the whole stuff would "only" eat ~100MB, but every PHP process thereafter maybe eats less memory?

I'm personally in favour of having composer.json generate a preload.php script that contains by default all the files from the preload section of the project itself and the vendor dependencies.

I don't mind if there's a way to exclude some files explicitly, though, if these files are most likely never used in the average project. But I'm not sure whether any such file would actually belong to the autoload section then, they would probably be dev classes used in autoload-dev only?

While I'm sympathetic to the simplicity of "preload all the things", there are some files that MUST NOT be preloaded. For instance, at Platform.sh we use a composer-loaded file (not class) to execute code before the application initializes. That lets us map host-provided environment variables into application-expected environment variables. The process works pretty well but that code needs to be run on every request, so the file has non-symbol-definition code (viz, it violates PSR-1), and so preloading it would break things. So it's probably useful for any auto-preload-builder thingie (Composer or otherwise) to include a way to let packages blacklist certain files that should never be preloaded.

As far as I understand it:

  • you don't have to execute the preloaded files, you can just opcache_compile_file() them; I would suggest that the preload.php file is just a big list of opcache_compile_file() statements;
  • To be verified: I wouldn't expect the code in the preloaded files to not execute. As I understand it, class definitions in such files are cached, but if you explicitly include a preloaded file, I think the code besides the classes will execute anyway.

The only issue I can think of, for your use case, is that class_exists() will return true because the file has been preloaded, so this might not trigger the autoloader; you would therefore have to include() the file explicitly. I would be curious to see what your motivation is for mixing class declarations with other code, though: should this (usually not recommended) approach prevent composer from doing the right thing for most other users? No offense here, but I think that composer should aim to support out of the box the recommended approach, and maybe your slightly exotic approach should require a custom preloading script?

@Toflar your response seems to assume that all your requests are using all the classes from the codebase (so used by the project is the same as used by the request). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.

Then preloading is not for these projects, as explicitly mentioned in the RFC:

And also, this approach will not be compatible with servers that host multiple applications, or multiple versions of applications - that would have different implementations for certain classes with the same name - if such classes are preloaded from the codebase of one app, it will conflict with loading the different class implementation from the other app(s).


As such, here is how I would implement preload.php (my 2 cents):

<?php

// list here all the files that are generated by the optimized autoloader

opcache_compile_file(...);
opcache_compile_file(...);
opcache_compile_file(...);
...

This would be the kind of file that would be generated with no extra configuration. It's there, you can use it if you want, but you don't have to.
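A generator for such a file could be as small as the sketch below. It is not existing Composer functionality; `render_preload_script` and the example paths are hypothetical.

```php
<?php
// Sketch: render a preload.php consisting only of opcache_compile_file() calls.
// `render_preload_script` is a hypothetical helper, not part of Composer.
function render_preload_script(array $files): string
{
    $lines = ['<?php', '', '// Generated file: one opcache_compile_file() call per preloaded file.'];
    foreach ($files as $file) {
        // var_export() produces a correctly quoted PHP string literal.
        $lines[] = 'opcache_compile_file(' . var_export($file, true) . ');';
    }

    return implode("\n", $lines) . "\n";
}

// Example with made-up paths; a real generator would write this to disk.
echo render_preload_script(['/app/vendor/acme/Foo.php', '/app/vendor/acme/Bar.php']);
```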

Optionally, I would add a preload-exclude or equivalent composer.json option, that allows to exclude individual files, or entire directories, or even vendor dependencies.

For those having multiple versions of an application / dependency: to reiterate, I would advocate to not use preloading at all. Or if you're feeling brave enough, fiddle with preload-exclude or write your own preload script.

For all other users out there, just including the auto-generated preload.php will work like magic.

@Toflar
Contributor Author

@Toflar Toflar commented Nov 14, 2018

Maybe we can get @dstogov to help us out to make the right decision for the PHP community here 😊

@teohhanhui

@teohhanhui teohhanhui commented Nov 14, 2018

@BenMorel:

@Toflar your response seems to assume that all your requests are using all the classes from the codebase (so used by the project is the same as used by the request). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.

Then preloading is not for these projects, as explicitly mentioned in the RFC:

And also, this approach will not be compatible with servers that host multiple applications, or multiple versions of applications - that would have different implementations for certain classes with the same name - if such classes are preloaded from the codebase of one app, it will conflict with loading the different class implementation from the other app(s).

You've misunderstood @Toflar's comment about some classes only being used in some requests of the application vs multiple applications / multiple versions of applications. They're not the same thing at all.

@BenMorel

@BenMorel BenMorel commented Nov 14, 2018

@teohhanhui I don't think I misunderstood @Toflar's comment, I was quoting @stof who was himself replying to @Toflar.

To clarify my thoughts:

  • only some classes being used in some requests of the application:
    the project would still benefit from preloading all classes (unless memory is an issue, which I highly doubt)
  • multiple applications with different versions of the same dependency / multiple versions of applications:
    I would not use preloading at all, or leave it to the user to create their own preloading script if they know what they're doing.

@dstogov

@dstogov dstogov commented Nov 14, 2018

Maybe we can get @dstogov to help us out to make the right decision for the PHP community here 😊

I think preloading is too new a feature to immediately implement support for it in Composer.
Adaptation of applications and frameworks for preloading should identify best solutions, missing functionality, etc.

I tried preloading the whole frameworks (ZendFramework) and application specific preloading (getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)). The second approach works better.

Read about usage of Java Class Data Sharing. We implemented similar technology, and may borrow use cases.
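The application-specific approach (listing the scripts OPcache has actually served and emitting `opcache_compile_file()` calls for them) might be sketched like this. The helper name `hot_scripts` and the hit threshold of 10 are assumptions; the `scripts`, `full_path` and `hits` keys are those returned by `opcache_get_status(true)`.

```php
<?php
// Sketch: derive a preload list from what OPcache has actually served.
// `hot_scripts` and the threshold are hypothetical choices for illustration.
function hot_scripts(array $scripts, int $minHits): array
{
    $hot = array_filter($scripts, static function (array $script) use ($minHits): bool {
        return $script['hits'] >= $minHits;
    });
    // Most frequently used first.
    usort($hot, static function (array $a, array $b): int {
        return $b['hits'] <=> $a['hits'];
    });

    return array_column($hot, 'full_path');
}

// Must run inside the SAPI that serves the application to see its cache.
$status = function_exists('opcache_get_status') ? opcache_get_status(true) : false;
if (is_array($status) && isset($status['scripts'])) {
    foreach (hot_scripts($status['scripts'], 10) as $path) {
        echo 'opcache_compile_file(' . var_export($path, true) . ');', PHP_EOL;
    }
}
```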

@BenMorel

@BenMorel BenMorel commented Nov 14, 2018

I tried preloading the whole frameworks (ZendFramework) and application specific preloading (getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)). The second approach works better.

Thanks for jumping in, @dstogov!

Could you please explain what you mean by works better? Did you get better performance? Did preloading all the classes take up too much memory? Did you have any issue?

@Crell

@Crell Crell commented Nov 14, 2018

@dstogov Can you clarify the question above regarding the memory usage of a preloaded class? Viz, if I have 100 classes that are used on virtually every request anyway, and I then preload all of them, we know that's going to save CPU time. However, is it going to increase, decrease, or have no effect on memory usage?

Similarly, if we preload 10 MB worth of code that is used only on a small fraction of requests, and there are 10 concurrent requests, have we now increased total memory usage by 10 MB (shared memory) or 100 MB (cost in each process)?

@Crell

@Crell Crell commented Nov 14, 2018

@BenMorel The example from Platform.sh is not a class at all. It's a file that looks something like this:

<?php

function stuff() {
  $_ENV['db_name'] = $_ENV['dbname'];
}

stuff();

(Because the application wants an environment variable with one name and our system by default provides it with another. This is a very over-simplified example but it gets the idea across.)

That file is then included by composer so it runs during autoload, before the application looks for its environment variables. There's nothing intrinsically wrong with that approach and it works quite well right now. My point is that such a file MUST NOT be preloaded, because then it won't actually run on subsequent requests and break the application. We don't need to do anything special here for it other than make sure that it doesn't get picked up and preloaded accidentally by whatever mechanism Composer ends up using.

@Seldaek
Member

@Seldaek Seldaek commented Nov 15, 2018

@Crell preloading such a file would have zero effect the way I understand it. As it doesn't declare a preloadable class it is ignored, and will be executed at runtime when included.

@DarkGhostHunter

@DarkGhostHunter DarkGhostHunter commented Jul 20, 2019

I am highly skeptical that Composer is the right place to do hot-path analysis to determine the optimal set of files to preload. Activating preload requires an ini setting as @BenMorel notes, so really all Composer would be able to do is generate a script that preloads everything (which you can then use or not) or allow libraries to declare "preload these files", and then generate a script that preloads just those.

Anything more complex would be, I think, way out of scope for Composer itself. (Maybe an extension?) Let's not give poor Jordi and Nils a task of implementing autoload machine learning, k? 😄

If that is so, there should be no problem adding a composer preload that generates a list of files to be preloaded as preload.php. Libraries could declare which files are "preloadable" under a key.

{
    "preload": {
        "psr-4": [
            "ServerUtils\\PingTools\\",
            "ServerUtils\\TracertManager"
        ],
        "files": [
            "helpers.php"
        ]
    }
}

If a library isn't preloadable, the developer can "add" a library to the autogenerated list using the project root's composer.json:

{
    "preload": {
        "psr-4": [
            "OldPackageNotPreloaded\\",
            "OtherNotPreloaded\\CommonClass"
        ],
        "files": [
            "my-app-helpers.php"
        ]
    }
}

The developer can use this list in development for a quick start, since vendor files shouldn't be risky to preload as they won't change. In production, the developer should ask OPcache for analytics and preload the most performant list of classes for their application.

Should that be enough? Composer will only help to make a preliminary preload list, but the final decision is still in the developer's hands.
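To make the proposal concrete, here is a rough sketch of how such a "preload" section could be expanded into files with the help of Composer's classmap (class name => file). The "preload" key is only proposed in this thread and does not exist in Composer; `resolve_preload` and the example paths are made up.

```php
<?php
// Sketch: expand a hypothetical "preload" section into a list of files.
// Nothing here is an existing Composer feature.
function resolve_preload(array $preload, array $classmap): array
{
    $files = $preload['files'] ?? [];
    foreach ($preload['psr-4'] ?? [] as $entry) {
        // An entry ending in "\" is a namespace prefix; otherwise it names one class.
        $isPrefix = substr($entry, -1) === '\\';
        foreach ($classmap as $class => $file) {
            $matches = $isPrefix
                ? strncmp($class, $entry, strlen($entry)) === 0
                : $class === $entry;
            if ($matches) {
                $files[] = $file;
            }
        }
    }

    return array_values(array_unique($files));
}
```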

@teohhanhui

@teohhanhui teohhanhui commented Jul 20, 2019

@DarkGhostHunter As has been pointed out multiple times in this thread, just preload everything. No need for such unnecessary configurations that will most likely end up in preloading too little anyway, thereby defeating the whole purpose.

@Ayesh
Contributor

@Ayesh Ayesh commented Jul 20, 2019

I also agree with what @Crell said about making composer determine the autoload files. This can easily increase the complexity of Composer and should not be part of a dependency manager.

The plugin I shamelessly self-plugged is also trying to generate a preload.php file based on the composer.json directives. With current Opcache statistics limitations, I don't think it's even possible to reliably list and stat opcache files across different SAPIs (cli to fpm, etc).

@DarkGhostHunter

@DarkGhostHunter DarkGhostHunter commented Jul 20, 2019

@DarkGhostHunter As has been pointed out multiple times in this thread, just preload everything. No need for such unnecessary configurations that will most likely end up in preloading too little anyway, thereby defeating the whole purpose.

On a development machine it could work since there are no high RAM limitations like in a production machine.

@CDRO

@CDRO CDRO commented Aug 1, 2019

I really like the idea of the composer.json preloading proposal, since it allows the package developer to tell which classes or files to preload. But it only makes sense for small packages that developers write for themselves and their current project and use case, not for frameworks like Symfony, TYPO3, Drupal or Laravel, which cannot know what the developer using the framework will actually use from it.

To me, the most sensible thing would be to develop a PHP package that analyses the code base of your project and generates the preload.php accordingly. This could be a composer plugin (or maybe something else that then gets integrated into a composer plugin) that takes any number of folders to check, tries to resolve all the used classes and generates the file accordingly.

This solution would have to rely on composer and the autoloader though, since there is no point in rewriting an autoloader just for this use case.

This would come in handy for developers, since they would not have to care and think about what they would actually have to provide (and based on myself, not having to care about stuff is always awesome).

Any thoughts?

EDIT: I'm a bit concerned about the whole preloading stuff too, since having to restart my php-fpm service means downtime, which might not be acceptable in some cases (even if we speak about 0.26-1.5 secs)

@pjona

@pjona pjona commented Sep 3, 2019

@CDRO maybe this will change with time and you will only need to reload php-fpm rather than restart it (so it will not kill existing requests). On the other hand, if 0.26-1.5 secs is not acceptable, then you should probably already have HA (a multi-server setup), which allows you to remove a member from the pool, restart PHP and add the member again, so there is no downtime.

@rask

@rask rask commented Sep 3, 2019

I thought an FPM reload was a graceful restart, meaning workers (that have preload data in memory) are killed and restarted once the current request has been handled. A restart would just kill the in-progress request and return an error to the client.

@CDRO

@CDRO CDRO commented Sep 4, 2019

If a reload does what @rask suggests, this would be indeed perfect, maybe @nikic has a better overview if it's like we expect it to be.

@pjona you're right, if this is an issue, HA should already be implemented, on the other hand this can easily be solved with deployment windows too, where it is accepted that the application might be down for some short time.

@nikic

@nikic nikic commented Sep 4, 2019

An FPM reload will not clear the preload state, you do need a full FPM restart.

@rask

@rask rask commented Sep 4, 2019

@nikic I see. FPM workers receive a baseline exec env from the process manager, which is what must be restarted for the workers to receive a new exec env properly?

@luispabon

@luispabon luispabon commented Sep 29, 2019

I'm not entirely sure it is possible to generate a useful list of stuff to preload without any execution stats, unless composer is somehow analysing the codebase and its real points of entry to see what's actually being used and what not. This is something a static analyser is probably better suited to do as some of the tooling required should already be there.

I'm currently custom-creating the preload file from real world opcache stats after letting the app I'm currently working on run for a few days in the wild (wild meaning automated traffic as the app is still in dev). This solution does work and can potentially be automated to some extent, but it'd be a custom job each time.

I'm not convinced composer is the right tool for this particular job.

@luispabon

@luispabon luispabon commented Sep 29, 2019

On the subject of preloading everything, and taking benchmarks above such as they are, the extra memory used by opcache can be very problematic when you're running your apps on highly promiscuous environments with very tight memory constraints. For instance, kubernetes. Any one node can be sharing a meagre 4GB of RAM with 15 or 20 pods whose resource limits and requirements have been tightly adjusted.

@DarkGhostHunter

@DarkGhostHunter DarkGhostHunter commented Nov 26, 2019

I think this is the preloading we are looking for: offer something basic, but let the developer expand on it.

{
  "preload": {
    "entrypoint": "entrypoint.php",
    "script": ["foo.php", "bar.php"],
    "directories": ["examples", "foo/bar"],
    "files": ["helpers.php"],
    "ignore": ["src/foo.php", "src/bar.php"]
  }
}

This gives 100% flexibility on what to preload:

  • Need an entrypoint in your project? Let Composer build it and point it in php.ini.
  • Use a script? Done.
  • Just everything inside a directory? Done.
  • Some files? Done.
  • Want to ignore some files on top of what you added? Done.

The procedure

Preloading means editing php.ini. The first step should be to point PHP at the entrypoint in the project root. That entrypoint should be handled by Composer. It will link all the preloading scripts from dependencies (or build them) into one file, which is the entrypoint, with one command:

composer preload build

That will cycle through every package looking for a preload key and add the scripts, directories and files (minus the ignored files) to a compiled real entrypoint. These are cached inside the Composer bootstrapping.
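To make the idea concrete, here is a minimal sketch (in Python, purely for illustration) of how one package's `preload` section could be expanded into a file list. Everything here is an assumption: the `preload` key does not exist in Composer's schema, `resolve_preload_files` is an invented name, and the `entrypoint` and `script` keys are left out because their exact semantics would still need to be defined.

```python
from pathlib import Path


def resolve_preload_files(package_root, preload: dict) -> list:
    """Expand a hypothetical "preload" config into sorted relative paths.

    "directories" are searched recursively for *.php files, "files" are
    added as given, and anything listed under "ignore" is dropped.
    """
    root = Path(package_root)
    ignored = {Path(p) for p in preload.get("ignore", [])}

    found = set()
    for directory in preload.get("directories", []):
        for path in (root / directory).rglob("*.php"):
            found.add(path.relative_to(root))
    for file in preload.get("files", []):
        found.add(Path(file))

    # Sorting keeps the generated entrypoint stable between runs.
    return sorted(p.as_posix() for p in found - ignored)
```

Run once per installed package, the resulting lists could then be concatenated into the single entrypoint that php.ini points to.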

Ideas?

@TonyVlcek

@TonyVlcek TonyVlcek commented Nov 27, 2019

@DarkGhostHunter I'm new here and don't see under the hood of composer. But from the perspective of a mere user, this looks good to me and like something I could work with 👍

@DarkGhostHunter

@DarkGhostHunter DarkGhostHunter commented Nov 27, 2019

@DarkGhostHunter I'm new here and don't see under the hood of composer. But from the perspective of a mere user, this looks good to me and like something I could work with 👍

While my suggestion will allow for automatic preloading, there is still progress to be made on preloading only the "hot" files. There should be a way to save OPcache analytics about which files are hit the most, and push a part of the list based on memory constraints or a percentage threshold. If Composer could do part of that job, it would be awesome.

The latter matters because you may preload a project with 1500 files, but you may get almost the same performance for 99% of requests with just 150 files. That way you could run 10 more PHP instances instead of just one.
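A small sketch of what "push a part of the list" could look like, again in Python and again only an illustration. It assumes per-file entries shaped like the `scripts` entries from `opcache_get_status()` (`full_path`, `hits`, `memory_consumption`). The function name and both parameters are invented, and hit counts are only a proxy for "share of requests served".

```python
def pick_hot_files(scripts, coverage=0.99, memory_budget=None):
    """Pick the most-hit files until a hit-coverage or memory limit is reached.

    coverage: stop once the selected files account for this share of all hits.
    memory_budget: optional cap in bytes on the summed memory_consumption.
    """
    ranked = sorted(scripts, key=lambda s: s["hits"], reverse=True)
    total_hits = sum(s["hits"] for s in ranked)

    selected, covered, used = [], 0, 0
    for script in ranked:
        if total_hits == 0 or covered / total_hits >= coverage:
            break  # enough of the traffic is already covered
        if memory_budget is not None and used + script["memory_consumption"] > memory_budget:
            break  # the next file would exceed the memory budget
        selected.append(script["full_path"])
        covered += script["hits"]
        used += script["memory_consumption"]
    return selected
```

With a skewed hit distribution, a short list can cover most of the hits, which is the 150-out-of-1500 situation described above.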

@CDRO

@CDRO CDRO commented Nov 29, 2019

I think everybody agrees that the best way to generate the preload list is to gather information via opcache and load only the files the project actually needs.

IMHO, since most projects using composer will probably build their application on a deployment server and then push the app/website to the production server, composer preload will not be able to make use of the opcache statistics (in these cases at least).

But what if it could actually access this information?

I could imagine the following solutions regarding these issues:

  • either create a composer package that would gather the statistics somewhere on the server and automatically/manually pull this information on composer update/composer preload
  • or create a secure endpoint, accessible only with a secret key known by composer and stored somewhere in the composer.lock or composer.json file, that allows composer preload to make a request to gather the information needed to build the preload script

Would this be a viable solution?

@Seldaek
Member

@Seldaek Seldaek commented Nov 29, 2019

You are welcome to keep the discussion going here as a central point for people interested in the topic to coordinate. But just to be clear, I am fairly confident that in the near future we are not going to add anything to Composer relating to preloading.

If in a year it turns out - after people have been playing with it - that there is something Composer is uniquely positioned to really help with, we can revisit. For now it seems to me much more like an application/deployment concern than a dependency management one.

@Seldaek Seldaek removed this from the 2.0 milestone Nov 29, 2019
@Seldaek Seldaek added this to the Low-Prio / Controversial milestone Nov 29, 2019
@kapitanluffy

@kapitanluffy kapitanluffy commented Dec 14, 2019

I think the problem here is we are trying to preload everything.

What if we give the responsibility to package developers instead? They can declare the classes/files that need to be cached in the composer.json.

{
    "preload": [
        "/package/AbstractClassInterface.php",
        "/package/AbstractClass.php",
        "/package/helpers.php"
    ]
}

Package developers are responsible for these files. The files should not have dependencies (or the dependencies should at least be included in the preload list as well). Composer will detect these declared files and automatically create a preload script which we can optionally use.
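The detection step could look roughly like the sketch below (Python, for illustration only). It assumes the usual `vendor/<vendor>/<package>/composer.json` layout and treats the declared paths as relative to each package root. The `preload` key is the proposal from this comment, not an existing Composer feature, and `collect_declared_preloads` is an invented name.

```python
import json
from pathlib import Path


def collect_declared_preloads(vendor_dir) -> list:
    """Gather the files that installed packages declare under "preload"."""
    files = []
    # Sorted so the resulting preload script is stable between runs.
    for manifest in sorted(Path(vendor_dir).glob("*/*/composer.json")):
        declared = json.loads(manifest.read_text(encoding="utf-8")).get("preload", [])
        for relative in declared:
            # A leading "/" is read as "relative to the package root".
            files.append((manifest.parent / relative.lstrip("/")).as_posix())
    return files
```

Packages that declare nothing contribute nothing, so the list stays opt-in per package.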

@DarkGhostHunter

@DarkGhostHunter DarkGhostHunter commented Dec 14, 2019

@NickSdot

@NickSdot NickSdot commented Jan 10, 2020

Trying to see this from a simple user's perspective.

I think I read through all the comments here. Did I miss something, or was this scenario really not mentioned anywhere?

A user is hosting on a shared web host or a server he doesn't have much control over. The user is running some kind of software that uses Composer. Let's imagine the preloading behavior is defined in each package or at root level. The user doesn't even know about it.

Well, yes, it's true that there is still the php.ini step, which prevents an accidental activation of PHP's preloading behavior (unless PHP itself gets something like a 'default link' to composer/preload.php in the future, which Toflar mentioned before).

But what if a user activates it because he found out about it in some documentation, but is not aware of the full consequences (good and bad)?

How will the user be able to get the preloaded files out of memory on his shared host?

In my opinion, activation of preloading should absolutely be based on a strong opt-in. For instance, by requiring the user to explicitly install the necessary code as a separate package, which is more complicated and prevents accidental activations. Neither Composer nor a package maintainer should decide this for the user without the user's acknowledgement, should they?

Of course this doesn't mean that a preloader should not use the Composer-generated class map.

@DarkGhostHunter

@DarkGhostHunter DarkGhostHunter commented Jan 10, 2020

@NickSdot You can't fix naiveness and irresponsibility with a Composer package. I agree, but the technique to properly make an optimal preload list is beyond Composer, so any point here apart from just seeing how the preload progresses in time is moot, imo.
