Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle file install failures more gracefully #1534

Merged
merged 9 commits into from Feb 16, 2021

Conversation

pmatilai
Copy link
Member

Run the file installation in multiple stages:

  1. gather intel
  2. unpack the archive to temporary files
  3. set file metadatas
  4. commit files to final location
  5. mop up leftovers on failure

This means we no longer leave behind a trail of untracked, potentially harmful junk on installation failure.

If commit step fails the package can still be left in an inconsistent stage, this could be further improved by first backing up old files to temporary location to allow undo on failure, but leaving that for some other day.
Also unowned directories will still be left behind.

And yes, this is a somewhat scary change as it's the biggest ever change to how rpm lays down files on install.

Fixes: #967 (+ multiple reports over the years)

Details in commit messages, there are several preparatory commits leading up to the "big switch" to this new mode of operation. This change has also been the major blocker to making the plugin API public, as this can have severe consequences on plugins which only expect to handle files one by one. Our resident plugins are not actually affected though.

@pmatilai pmatilai added the bug label Feb 10, 2021
@pmatilai
Copy link
Member Author

Oh and FWIW, this certainly deserves further test-cases but don't let that stop reviewing...

@Conan-Kudo
Copy link
Member

Do you have any suggested test patterns beyond just "replace regular RPM and do standard DNF stuff"?

@pmatilai
Copy link
Member Author

Of particular interest are various failure scenarios, whether due to malformed payload (which you have to force with --noverify) or fs-level issues such as immutable files preventing operation, running out of disk-space etc. Two things to watch out for: that no temporary files are left behind, and that no permanent files get accidentally removed.

@ffesti
Copy link
Contributor

ffesti commented Feb 12, 2021

Minor nit pick: The name and usage of the new stage member in the last patch isn't as obvious at probably should be. Also the numbers used are nicely unrelated to the stages listed in the commit message. May be having named constant instead of random numbers makes this more readable.

@ffesti
Copy link
Contributor

ffesti commented Feb 12, 2021

Do we really want to call rpmpluginsCallFsmFilePost() for all files in case of failure? Or should be not call it at all in this case.

@pmatilai
Copy link
Member Author

pmatilai commented Feb 12, 2021

Yup, the file plugin calls are buggy here.
The common theme with our plugins is that pre- and post-slots get called in pairs: if the first is called then the latter is always called too. The code was quite careful to follow this, but now this is all so very different now - for one, the last loop runs in reverse order. Okay, it could and probably should look at the stage (and yeah the stage thing is ... vague atm) to determine if pre was called and only call post for those. Then again, I wonder if the pre-post pairing makes sense anymore at all. Especially since I don't think any plugin actually uses those slots...

@pmatilai
Copy link
Member Author

File-post slot calls fixed to be called only on files we actually processed, which is in line with the former "in pairs" policy. Also added names to the "stages", these don't seem particularly meaningful at this stage (pun intended or not) of development but certainly no worse than the abstract numbers either.

I'm really tempted to move the pre- and post-slots around fsmCommit(), but this would mean those would be called after creation for directories 😒

@pmatilai
Copy link
Member Author

Added some tests to demonstrate our new clean-up capabilities.

Handle rpmfiNext() in the while-condition directly to make it more like
similar other constructs elsewhere, adjust for the end of iteration
code after the loop. Also take the file index from rpmfiNext() so
we don't need multiple calls to rpmfiFX() later.
Collect the common state info into a struct shared by both file install
and remove, update code accordingly. The change looks much more drastic
than it is - it's just adding fp-> prefix to a lot of places.
While we're at it, remember the state data throughout the operation.

No functional changes here, just paving way for the next steps which
will look clearer with these pre-requisites in place.
Move setmeta into the struct too (we'll want this later anyhow),
and now we only need to pass the struct to fsmMkfile(). One less
argument to pass around, it has way too many still.

No functional changes.
We only use a temporary suffix for regular files that we are actually
creating, skipped and touched files should not have it. Add XFA_CREATING()
macro to accomppany XFA_SKIPPING() to easily check whether the file
is being created or something else.

No functional changes but makes the logic clearer.
We only use a temporary suffix for regular files that we are actually
creating, skipped and touched files should not have it. Add XFA_CREATING()
macro to accomppany XFA_SKIPPING() to easily check whether the file
is being created or something else.

No functional changes but makes the logic clearer.
No functional changes, just makes it a little cleaner as firstlink now
points to the actual file data instead of a index number somewhere.
Run the file installation in multiple stages:
1) gather intel
2) unpack the archive to temporary files
3) set file metadatas
4) commit files to final location
5) mop up leftovers on failure

This means we no longer leave behind a trail of untracked, potentially
harmful junk on installation failure.

If commit step fails the package can still be left in an inconsistent stage,
this could be further improved by first backing up old files to temporary
location to allow undo on failure, but leaving that for some other day.
Also unowned directories will still be left behind.

And yes, this is a somewhat scary change as it's the biggest ever change
to how rpm lays down files on install. Adopt the hardlink test spec
over to install tests and add some more tests for the new behavior.

Fixes: rpm-software-management#967 (+ multiple reports over the years)
@pmatilai
Copy link
Member Author

Rebased and fixups squashed.

@ffesti
Copy link
Contributor

ffesti commented Feb 16, 2021

My concerns have been addressed. Looks really good now!

@pmatilai
Copy link
Member Author

Okay, lets move on then, I've a further pile building on top of this already. Thanks for reviewing.

@pmatilai pmatilai merged commit a82251b into rpm-software-management:master Feb 16, 2021
@pmatilai pmatilai deleted the fsm-pr branch February 19, 2021 12:58
@chantra chantra mentioned this pull request Feb 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rpm leaves untracked files behind on install failure
3 participants