Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev/core#2607 - Maintain list of deleted files #28653

Merged
merged 3 commits into from
Dec 18, 2023

Conversation

colemanw
Copy link
Member

@colemanw colemanw commented Dec 14, 2023

Overview

Towards addressing the problem that some upgraders leave old files in the tree, which can cause php fatal errors when they continue to autoload, or security issues if the files were deleted for security reasons.

See https://lab.civicrm.org/dev/core/-/issues/2607

Before

System check exists but relies on hand-curated list which no one has been keeping up with.

After

Same system check but uses autogenerated list of files:

image

Technical Details

This autogenerates a list of files and directories which no longer exist, based on git log. I've added it to the set-version script to automatically update the list with every version update.
We could use that list to attempt to auto-remove them during upgrades per this suggestion.

Comments

It's a big list but I was able to shrink the filesize by limiting the history to 4.6.0 and by consolidating deleted directories to a single line like path/name/* instead of listing every file in the directory.

Copy link

civibot bot commented Dec 14, 2023

🤖 Thank you for contributing to CiviCRM! ❤️ We will need to test and review this PR. 👷

Introduction for new contributors...
  • If this is your first PR, an admin will greenlight automated testing with the command ok to test or add to whitelist.
  • A series of tests will automatically run. You can see the results at the bottom of this page (if there are any problems, it will include a link to see what went wrong).
  • A demo site will be built where anyone can try out a version of CiviCRM that includes your changes.
  • If this process needs to be repeated, an admin will issue the command test this please to rerun tests and build a new demo site.
  • Before this PR can be merged, it needs to be reviewed. Please keep in mind that reviewers are volunteers, and their response time can vary from a few hours to a few weeks depending on their availability and their knowledge of this particular part of CiviCRM.
  • A great way to speed up this process is to "trade reviews" with someone - find an open PR that you feel able to review, and leave a comment like "I'm reviewing this now, could you please review mine?" (include a link to yours). You don't have to wait for a response to get started (and you don't have to stop at one!) the more you review, the faster this process goes for everyone 😄
  • To ensure that you are credited properly in the final release notes, please add yourself to contributor-key.yml
  • For more information about contributing, see CONTRIBUTING.md.
Quick links for reviewers...

➡️ Online demo of this PR 🔗

@civibot civibot bot added the master label Dec 14, 2023
@colemanw colemanw force-pushed the deletedFiles branch 4 times, most recently from 89addd1 to 4580e58 Compare December 14, 2023 16:37
@totten
Copy link
Member

totten commented Dec 14, 2023

Cool :)

Note there's a typo in the filename (generage / generate)

(It would be nice if we had a folder like meta/ or config/ or app/ to put this kind of thing in. The xml/ folder is probably closest, but it doesn't really have the same cachet as in 2004...)

@colemanw
Copy link
Member Author

Note there's a typo in the filename (generage / generate)

That's what happens when you stare at something for too long...

(It would be nice if we had a folder like meta/ or config/ or app/ to put this kind of thing in. The xml/ folder is probably closest, but it doesn't really have the same cachet as in 2004...)

You're welcome to move the files around... the deleted/moved files will be automatically picked up by the script and added to the deleted-files-list :)

@eileenmcnaughton
Copy link
Contributor

@colemanw do we need to go as far back as 4.6.0?

@colemanw
Copy link
Member Author

do we need to go as far back as 4.6.0?

Given the long history of these problems I would vote to go back at least that far (if not farther) initially, so we can ensure people's sites really are clean. Then after about a year we can cut the list down to just keep up with CRM_Upgrade_Form::MINIMUM_UPGRADABLE_VERSION (there may be a way to sync it to that constant, for lower maintenance).

@demeritcowboy
Copy link
Contributor

Towards addressing the problem that some upgraders leave old files in the tree, which
can cause php fatal errors when they continue to autoload, or security issues if the
files were deleted for security reasons.

This generates a list of files which no longer exist. We could use that list to attempt
to auto-remove them during upgrades, or at the very least show an alert if they are present.
@colemanw colemanw changed the title Add deleted-files-list dev/core#2607 - Maintain list of deleted files Dec 15, 2023
@colemanw
Copy link
Member Author

@totten I've pushed an update that adapts your orphanedFiles system check to use the new list. One question I wasn't sure of is why was that check setup to look outside the [civicrm.root] directory, but then ignore those other places (seems like multiple layers of patching resulted in not-the-most-straightforward logic perhaps). I set it to only look in [civicrm.root] which is probably all we care about (composer-based setups that have a different file structure would also have composer managing files for them, so the file-deletion ought to be ok).

@demeritcowboy
Copy link
Contributor

I think it's because the check is mixing multiple things so that might be confusing. There is a security check, a joomla-ish check, and then a files-were-moved-from-packages-to-composer-but-for-some-reason-are-still-in-packages check. The files deleted by composer scripts are files you don't want to exist if vendor is in a web-accessible place, because they're not great to expose. Ideally vendor wouldn't be in that spot, but it is a configuration some sites have from before and still works. But if I remember right the trick was determining "web accessible", so instead it excludes drupal 8 to avoid false alerts.

$files[] = '[civicrm.packages]/jquery/plugins/DataTables/media/css/jquery.dataTables_themeroller.css';
$files[] = '[civicrm.packages]/jquery/plugins/DataTables/media/js/jquery.dataTables.js';
$files[] = '[civicrm.packages]/jquery/plugins/DataTables/media/js/jquery.dataTables.min.js';
$files[] = '[civicrm.packages]/jquery/plugins/DataTables/media/js/jquery.js';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem being solved is this: Merely unzipping the new code release doesn't delete files, esp on Joomla. (It's true anywhere, but Joomla doesn't have alternative workflows to get around the issue.)

AFAICS, the same problem applies equally to civicrm-core and civicrm-packages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten this PR does include civicrm-packages.

$files[] = '[civicrm.vendor]/pear/net_smtp/examples';
$files[] = '[civicrm.vendor]/pear/net_smtp/tests';
$files[] = '[civicrm.vendor]/pear/net_smtp/phpdoc.sh';
$files[] = '[civicrm.vendor]/phpoffice/phpword/samples';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For someone installing from zip/tar, it's the same issue as civicrm-core or civicrm-packages. (Merely extracting won't delete things.)

On MM, you were brainstorming about whether composer changes the need for checks.

In my mind, the general reason to do run-time sniff-tests on [civicrm.vendor] is to ensure that either composer cleanup steps were executed or that they're not necessary. Reasoning about what composer will or will not do (across the range of configurations/environments/admins) is not always simple. For example, just within D10, you have these two variations in process:

## Typical process for core-contributor is like...
git clone https://github.com/drupal/drupal mysite
cd mysite
composer install
## Typical process for site-builder is like...
git clone https://github.com/drupal/recommended-project mysite
cd mysite
composer install

One has a public vendor (where you do should think about HTTPD policies and example files). The other has a private vendor (where you don't need to think about HTTPD policies or whether example files are web-accessible). For one style, it doesn't matter if you fire Civi's composer-cleanup-scripts. For the other style, it might. (But not necessarily! Depends on HTTPD!) In either case, neither of them do fire cleanup scripts on that folder (at least, by default).

Of course, maybe these are two different things: (1) checking that deleted files are deleted vs (2) checking that non-public files are non-public. If these [civicrm.vendor] things are covered by some other HTTP-level check, then maybe the file-level check would be somewhat redundant?

(Except that we still have the issue where unzipping/untarring a new version of Civi can leave you with old files...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten I'm not sure based on this comment if a change to this PR is needed?

@totten totten merged commit f0506ba into civicrm:master Dec 18, 2023
3 checks passed
@colemanw colemanw deleted the deletedFiles branch December 18, 2023 20:36
@demeritcowboy
Copy link
Contributor

Reports of my death have been greatly exaggerated:

Old files
The local system includes old files which should not exist:
    /home/jenkins/bknix-dfl/build/core-28705-4mly7/web/sites/all/modules/civicrm/ext/standaloneusers/Civi/Api4/RolePermission.php

@totten
Copy link
Member

totten commented Dec 18, 2023

Reports of my death have been greatly exaggerated:

Hrm. So the theory here was that you'd take a snapshot of deleted-files whenever the version changes. And the focus was on install/upgrade, so that seemed fine for those workflows.

But... yeah... for warnings that are emitted on master, perhaps it's going to be prone to misreporting? Would it make more sense to make the list as part of GenCode (eg setup.sh -g) (In that case, the JSON file doesn't need to be committed?)

@demeritcowboy
Copy link
Contributor

I'm not sure it's master-specific, more that this file existed, then didn't, then did again.

@totten
Copy link
Member

totten commented Dec 19, 2023

(@demeritcowboy) I'm not sure it's master-specific, more that this file existed, then didn't, then did again.

Was it race amongst PRs?

Would it make more sense to make the list as part of GenCode (eg setup.sh -g) (In that case, the JSON file doesn't need to be committed?)

Correcting myself: if the check is to run on all environments (incl D8+), then yes, it needs to be committed.

@demeritcowboy
Copy link
Contributor

Oh, maybe. Looks like the resurrection was 4 days ago.

@seamuslee001
Copy link
Contributor

@demeritcowboy but I thought the resurrection was due in the first half of the calendar year

@demeritcowboy
Copy link
Contributor

Timezones?

@colemanw
Copy link
Member Author

If the file was resurrected after this file was last generated, then you'll get the warning. Regenerating the file would fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants