[4.x] When an update fails #37870

Closed
brianteeman opened this issue May 23, 2022 · 10 comments

@brianteeman
Contributor

Just tried to do an update on a server and got the following error message

image

I am not reporting that this error occurred

The problem is that when this occurs you cannot log out or visit any other page in the admin, because there is no navigation left, as seen below after closing the error message.

image

ping @nikosdion as requested

@richard67
Member

@brianteeman Does a page reload help?

@brianteeman
Contributor Author

No, that just reloads the error box.

@nikosdion
Contributor

Remember that any buttons rendered on that page have to be rendered BEFORE the update starts. Here are two problems for you.

Let's say no problem occurs BUT you click on the Back button before the update finishes. Your site at this point is half-upgraded. In other words it's bricked. Even if it's still usable you may not be able to run the upgrade again if the Version.php file has been overwritten. So, displaying a button would be problematic.

When the error handler runs we DO NOT know what the problem is, we DO NOT know how much of the update has run (are we half-updated?), we DO NOT know if it's safe to reload the site. That's why we show a message with a link to a documentation page and stay there.

Yes, Brian, even your error can occur in the middle of an upgrade. In some cases a stupid host's configuration may see the consecutive calls to the extract.php file from the same IP and block that IP for a few minutes. You'd still get a 403 error. So even with this ostensibly "safe" error message there is at the very least one unsafe condition which doesn't let us go back to the site.

Instead of making (wrong) assumptions about whether the site's state is safe or not I'd rather have the user stuck there and try to follow the documentation or close the tab and try to reload the site's backend (in which case a partial update becomes evident and they will HAVE to read the documentation or ask for help in the forum).

So, no, it's not an oversight... it's just a lack of any good options at this point. Ideally we'd make the popup impossible to close but I can't see such an option in Bootstrap.

@jeckodevelopment
Member

The "ideal" scenario would be to revert the work done on the update, restoring the starting condition.
But this implies that you backup the initial condition and you're able to restore it.

@nikosdion
Contributor

@jeckodevelopment That's why we tell users to take a backup of their site before updating. You really cannot do that automatically. It would require knowing all the files you are a. adding, b. modifying and c. deleting. Then you'd need to take a backup of them. Even backing up these files, a mere 50MiB or thereabouts, is nowhere near as simple as it sounds.

It is far more productive for the project to ask people to take a backup using any method they are familiar with (e.g. host's backup feature, an extension like Akeeba Backup Core or XCloner, a ZIP/tar of the core directories, ...) and restore the site to its prior state if the update bombs.

I can also tell you, based on my experience with the automatic Backup on Update in Akeeba Backup Professional, that even if you take everything into consideration there ARE times when the backup before the update will fail, ranging from the mundane (ran out of disk space) to the outright arcane (CoreLinux kernel, unlike the vanilla Linux kernel, will NOT free up filesystem cache memory when memory pressure rises, leading to the server killing further web requests for the next several minutes). You DO NOT want to have to deal with that on the CMS maintenance level. Very few people have the necessary experience to do that and nobody will want to do that for free. Not to mention that the release cadence of the CMS is glacial compared to what is needed to keep up with the new ways hosts find to ~~inflict problems~~ optimise their servers.

@jeckodevelopment
Member

jeckodevelopment commented May 25, 2022

I totally agree with you, Nicholas.
I understand the problem and agree that, from a UX perspective, there could be some improvements.
But there are too many variables to have a single solution.
Suggesting a backup is the way to go, imho.

@nikosdion
Contributor

I've been thinking about this.

On error we could hide the extraction progress and replace it with the error message (instead of a modal). At the bottom of the message we could have standard text like “You can try to go back and retry the update. If your site has become inaccessible due to an incomplete update please restore it from a backup and retry the update.” with a "Back" button to take you back.

This is pretty much what I do when there is an error taking a backup.

Another thing I am doing when a backup fails is three retries. The first time an error occurs I show the error message and a 10 second countdown. At the end of the countdown the last step of the backup (in our case: extraction of the update) is retried. Up to three retries are made before the error becomes permanent and a show-stopper in which case what I described above happens.

The idea behind the retries is that some hosts may simply block your IP temporarily after seeing that you are hitting the same URL from the same IP address repeatedly. For maximum compatibility with cheap / restrictive hosts we could set the retry timeout to 60 seconds. This would allow recovery of some typical problems.
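The retry idea above can be sketched as a small decision function. This is only a minimal TypeScript sketch under stated assumptions, not the actual code of Joomla Update or Akeeba Backup: `RetryPolicy`, `RetryDecision` and `decideAfterError` are hypothetical names, and the numbers (3 retries, 10 or 60 seconds) are the ones mentioned in the comment.

```typescript
// Hypothetical sketch of the retry policy described above.
// None of these names come from Joomla Update or Akeeba Backup.

interface RetryPolicy {
  maxRetries: number;  // the comment proposes 3
  waitSeconds: number; // 10 in the backup software; 60 proposed for restrictive hosts
}

type RetryDecision =
  | { kind: "retry"; attempt: number; waitSeconds: number }
  | { kind: "giveUp"; message: string };

// retriesSoFar counts the retries that have already been made and failed.
function decideAfterError(
  retriesSoFar: number,
  policy: RetryPolicy,
  errorMessage: string
): RetryDecision {
  if (retriesSoFar < policy.maxRetries) {
    // Show the error with a countdown, then retry the last extraction step.
    return { kind: "retry", attempt: retriesSoFar + 1, waitSeconds: policy.waitSeconds };
  }
  // Retries exhausted: the error becomes permanent and the UI falls back
  // to the error message with the Back option.
  return { kind: "giveUp", message: errorMessage };
}

const policy: RetryPolicy = { maxRetries: 3, waitSeconds: 60 };
const first = decideAfterError(0, policy, "403 Forbidden"); // first error: retry 1 of 3
const last = decideAfterError(3, policy, "403 Forbidden");  // fourth error: give up
```

Keeping the decision separate from the countdown UI and from the request to the extraction endpoint would let each part be tested on its own.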

Finally, alongside the Back button we could also have a Restart button which tries restarting the extraction from scratch. Use case: you have tweaked your .htaccess file and accidentally made accessing extract.php return a 403. Instead of having to go back, re-download the update and retry the extraction, you can just restart the extraction with the update file you already have.

This would bring Joomla Update's UX in line with the UX improvements I had made in my company's software in 2015 after doing a user study (we paid newbies, intermediate and experienced users to use our software to accomplish set goals while being recorded and talking us through their thought process).

Any ideas welcome.

@jeckodevelopment
Member

I like the approach; it's already known by most users (almost everyone in the Joomla-sphere has used Akeeba Backup).
Reading your idea, I was just thinking of users who tweak PHP parameters/limits between attempts (e.g. increasing max_execution_time, memory_limit, max_input_time and so on) or, as you mentioned, the .htaccess file.
I see the idea as consistent with expectations... if we are honest, when a procedure fails everyone just attempts to execute it again.
Thanks Nicholas for that.

@nikosdion
Contributor

@jeckodevelopment Thank you for reminding me! When taking a backup I am setting a ridiculously high PHP maximum execution time (3600 seconds) and a ridiculously high memory limit (1024MiB). I could definitely do that for the extraction. The former might indeed make a difference; the latter is very unlikely to, unless the host has an unreasonably small memory limit which wouldn't let Joomla operate (think about 16MiB or less). But, uh, there's no harm in doing so as my code correctly detects when it's safe / possible to do that.
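One part of deciding whether raising the memory limit is safe can be illustrated as follows. This is a TypeScript model of the decision only, under the assumption that "safe" includes never lowering an existing limit; the real change would live in PHP (for example via `ini_set`) and would also need to check whether the host allows the setting to be changed at all. `parsePhpBytes` and `shouldRaiseLimit` are hypothetical names; the shorthand being parsed is PHP's `K`/`M`/`G` notation, where `-1` means no limit.

```typescript
// Hypothetical model of the "is it worth raising memory_limit?" check.
// It mirrors PHP's shorthand byte notation but is not Joomla or Akeeba code.

function unitMultiplier(unit: string): number {
  switch (unit) {
    case "k": return 1024;
    case "m": return 1024 * 1024;
    case "g": return 1024 * 1024 * 1024;
    default: return 1; // plain number of bytes
  }
}

// Converts values such as "128M", "1G" or "-1" to bytes.
// Returns Infinity for "-1" (no limit) and NaN for anything unrecognised.
function parsePhpBytes(value: string): number {
  const trimmed = value.trim().toLowerCase();
  if (trimmed === "-1") return Infinity;
  const match = /^(\d+)\s*([kmg]?)$/.exec(trimmed);
  if (!match) return NaN;
  return Number(match[1]) * unitMultiplier(match[2] ?? "");
}

// Raise only when the current limit is known and lower than the target,
// so an unlimited or already generous setting is never reduced.
function shouldRaiseLimit(currentSetting: string, targetBytes: number): boolean {
  const current = parsePhpBytes(currentSetting);
  if (isNaN(current)) return false; // unknown format: leave the setting alone
  return current < targetBytes;
}

const TARGET_BYTES = 1024 * 1024 * 1024; // the 1024MiB mentioned above
const raiseOnTinyHost = shouldRaiseLimit("16M", TARGET_BYTES);
const raiseOnUnlimited = shouldRaiseLimit("-1", TARGET_BYTES);
```

With these inputs a 16MiB host would get the higher limit, while a host that already has no limit would be left untouched.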

I have a lot of open work right now, both for my business and for Joomla. When things start slowing down enough for me to get some breathing room I'll make a PR for that.

Can a maintainer please assign this issue to me so it appears on my GitHub to-do list (issues assigned to me across all organisations)?

@brianteeman
Contributor Author

Closing as it was addressed with #38002

5 participants