Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The autoloader cache can keep an invalid entry for a class after an update #11290

Closed
danxuliu opened this issue Sep 19, 2018 · 6 comments
Closed
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap bug needs info stale Ticket or PR with no recent activity

Comments

@danxuliu
Copy link
Member

This is a corner case found during development; I do not know if it could happen in the real world or if the regular updater has means to prevent this (although #11225 looks similar), but in any case better have it documented.

In short, if a class that does not exist is queried it is stored as non-existent in the autoloader cache, but if the class is later introduced the cache still lists it as non-existent and thus the class can not be loaded. For the excruciatingly long explanation, see below ;-)

First, the steps to reproduce the problem:

  • Checkout the appropriate commits
    • On a Nextcloud Git directory, checkout Nextcloud 14.0.0 with git checkout v14.0.0 && git submodule update --recursive
    • If you do not have it already, clone Nextcloud Talk in the apps directory with git clone https://github.com/nextcloud/spreed
    • In the apps/spreed directory, checkout the commit before RoomShareProvider was introduced with git checkout 220726000^
  • Create a Docker container for the Nextcloud server
    • Start a Docker container for the Nextcloud server in the Nextcloud Git directory with docker run --rm --volume /PATH/TO/THE/NEXTCLOUD/GIT/DIRECTORY/:/var/www/html --volume /var/www/html/data --volume /var/www/html/config --publish 8000:80 --interactive --tty --name nextcloud-class-cache-bug nextcloudci/acceptance-php7.1:acceptance-php7.1-2
      • The image for the acceptance tests is used because it provides Apache and everything needed to run Nextcloud from a specific directory in the host, but note that the problem is in no way related to the acceptance tests.
    • The previous command would have opened a Bash session in the container; start Apache in that session with service apache2 start
  • In a different terminal, setup the Nextcloud server
    • Ensure that the Apache user will be able to write in config and data with docker exec -it nextcloud-class-cache-bug chown -R www-data:www-data /var/www/html/config /var/www/html/data
    • Install the Nextcloud server with docker exec -it --user www-data nextcloud-class-cache-bug bash -c "cd /var/www/html && php occ maintenance:install --admin-pass=admin"
    • Enable Nextcloud Talk with docker exec -it --user www-data nextcloud-class-cache-bug bash -c "cd /var/www/html && php occ app:enable spreed"
  • Log in to Nextcloud
    • In a web browser, open http://127.0.0.1:8000 and log in with user admin and password admin; you should see the Files app
  • Checkout a newer commit of Talk
    • In the apps/spreed directory, checkout the commit in which RoomShareProvider is (implicitly) used in the constructor of the PageController of Talk with git checkout fbe65943
  • Start the update
    • In the web browser, reload the page; you should see the web updater
    • Click on Start update; you should see (eventually) the Files app
  • Trigger the bug
    • In the web browser, click on the Talk icon in the list of apps; you should see an Internal Server Error page

If you then check the Nextcloud log you will see the following error message: OCP\AppFramework\QueryException: Could not resolve api! Class api does not exist. The exception is thrown when trying to instantiate the PageController but failing to instantiate one of its constructor parameters. However, although the message is correct, it is misleading in two different ways.

When the SimpleContainer tries to get the constructor parameters it first tries to query the class name (if available) and, if that fails, then it tries to query the name of the parameter itself; the parameter that can not be instantiated is RoomController $api, so although it is true that Class api does not exist the problem comes from not being able to instantiate RoomController in first place instead.

However, the real problem is not in RoomController either; RoomController can not be instantiated because one of its constructor parameters can not be instantiated, and, in turn, that parameter can not be instantiated because one of its constructor parameters can not be instantiated because the class of the parameter, RoomShareProvider, is not found.

In the above steps, if the container is started with Talk already in commit fbe65943 then RoomShareProvider can be found and everything works as expected. So, why is RoomShareProvider not found if Talk is updated instead?

If APCu is available (if installed, it is enabled by default), the main class loader of Nextcloud server tries to fetch classes from the cache when they are not found in its class map (like those in external apps without its own class loader). Therefore, the classes in Talk, like OCA\Spreed\Share\RoomShareProvider, are returned from the main autoloader cache if found there.

If the class is not found then it is looked for it and the result is saved in the autoloader cache. However that is done in any case, even if the file is not found. And here comes the problem. In the above steps, when logging in for the first time the share provider factory queries the RoomShareProvider; it is not found, because in the Talk commit in which the server was started the class did not exist yet, and thus the class is stored as non-existent in the autoloader cache. However, that entry is not cleared after Talk is updated, so even if the PHP file now does exist it is still not found because it is marked as non-existent in the cache.

As a final note it is worth pointing out that the problem does not occur if the server is started without enabling Talk and then Talk is enabled directly in the commit fbe65943. In theory, in that case RoomShareProvider would have been saved as non-existent too when logging in for the first time (although I have not actually verified it), but maybe the cache is cleared when an app is installed and thus the problem does not occur in that case; I have no idea :-)

@danxuliu danxuliu added the bug label Sep 19, 2018
@nextcloud-bot
Copy link
Member

GitMate.io thinks possibly related issues are #10922 (Update to NC14 RC 1 No cache entry found), #10406 (Not all sidebar entries have the "classes" attribute), #7488 (Update caches for federated shares in cron), #10570 (cache root cache entry), and #10642 (Bump autoloaders).

@nickvergessen
Copy link
Member

So to summarize it shortly:
https://getcomposer.org/doc/articles/autoloader-optimization.md#autoloader-optimization

Note: You should not enable any of these optimizations in development as they all will cause various problems when adding/removing classes. The performance gains are not worth the trouble in a development setting.

The actual fix is to restart apcu and kill the cache with that completely.

Anyway, so the autoloader has a --apcu flag which stores whether the autoloader should use apcu at all. If I replace the call in

$COMPOSER_COMMAND dump-autoload -d $REPODIR

and add --apcu there is a diff on the autoloader:

diff --git a/lib/composer/composer/autoload_real.php b/lib/composer/composer/autoload_real.php
index b9f89d16ad..9b601da0a7 100644
--- a/lib/composer/composer/autoload_real.php
+++ b/lib/composer/composer/autoload_real.php
@@ -45,6 +45,7 @@ class ComposerAutoloaderInit53792487c5a8370acc0b06b1a864ff4c
             }
         }
 
+        $loader->setApcuPrefix('5RyxmrkLF+KeYqj7y5Qys');
         $loader->register(true);
 
         return $loader;

without that, the apcuPrefix is never set and therefor the cache should never be used.

So I had a look and tried to see where else this is set and found:

self::$composerAutoloader->setApcuPrefix($instanceId . '-mainComposer');

The problem is, that our key does not change ever at all. Maybe instead it should be updated on version changes (of any app and the server (but that is too deep in the chain I think)) and additionally we should not set it at all for $CONFIG['debug'].

Since this is way to deeeeeeep down a wrong/unsafe road, I propose to revert #9442 for now, which newly introduced this for 14.

nickvergessen added a commit that referenced this issue Sep 19, 2018
This reverts commit 948ab8a.

For details why see #11290
nickvergessen added a commit that referenced this issue Sep 19, 2018
This reverts commit 948ab8a.

For details why see #11290
@skjnldsv
Copy link
Member

@nickvergessen this is fixed then?

@nickvergessen
Copy link
Member

This is fixed, but we should try to get it working properly to save filesystem checks etc

@skjnldsv skjnldsv added the 0. Needs triage Pending check for reproducibility or if it fits our roadmap label Jun 12, 2019
@skjnldsv
Copy link
Member

Please update this issue, either to develop or close :)

@ghost
Copy link

ghost commented May 10, 2020

This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.

@ghost ghost added the stale Ticket or PR with no recent activity label May 10, 2020
@ghost ghost closed this as completed May 24, 2020
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap bug needs info stale Ticket or PR with no recent activity
Projects
None yet
Development

No branches or pull requests

4 participants