Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Response to backdoor incident #103

Closed
thesamesam opened this issue Apr 9, 2024 · 41 comments
Closed

Response to backdoor incident #103

thesamesam opened this issue Apr 9, 2024 · 41 comments

Comments

@thesamesam
Copy link
Member

A backdoor was discovered in xz-5.6.0 and xz-5.6.1 by Andres Freund.

Updates are being posted at https://tukaani.org/xz-backdoor/ and the links at the bottom of the page.

Please be patient while everything is cleaned up. It will take time. In particular, please avoid speculation here which would be better placed in other venues.

@Larhzu Larhzu pinned this issue Apr 9, 2024
@gbraad
Copy link

gbraad commented Apr 10, 2024

thanks was added in db4dd74 and removal/reponse in e93e13c

@Paladynee
Copy link

Paladynee commented Apr 10, 2024

Larhzu referenced this issue Apr 12, 2024
While the backdoor was inactive (and thus harmless) without inserting
a small trigger code into the build system when the source package was
created, it's good to remove this anyway:

  - The executable payloads were embedded as binary blobs in
    the test files. This was a blatant violation of the
    Debian Free Software Guidelines.

  - On machines that see lots bots poking at the SSH port, the backdoor
    noticeably increased CPU load, resulting in degraded user experience
    and thus overwhelmingly negative user feedback.

  - The maintainer who added the backdoor has disappeared.

  - Backdoors are bad for security.

This reverts the following without making any other changes:

6e63681 Tests: Update two test files.
a3a29bb Tests: Test --single-stream can decompress bad-3-corrupt_lzma2.xz.
0b4ccc9 Tests: Update RISC-V test files.
8c9b8b2 liblzma: Fix typos in crc32_fast.c and crc64_fast.c.
82ecc53 liblzma: Fix false Valgrind error report with GCC.
cf44e4b Tests: Add a few test files.
3060e10 Tests: Use smaller dictionary size in RISC-V test files.
e2870db Tests: Add two RISC-V Filter test files.

The RISC-V test files also have real content that tests the filter
but the real content would fit into much smaller files. A generator
program would need to be available as well.

Thanks to Andres Freund for finding and reporting it and making
it public quickly so others could act without a delay.
See: https://www.openwall.com/lists/oss-security/2024/03/29/4
@cybik
Copy link

cybik commented Apr 12, 2024

Copying my comment over on the commit: e93e13c#commitcomment-140893111

For the record, while I applaud everyone keeping a cool head and responding to this issue promptly,
the fact of the matter is that these files still "exist" within the history of this repository.

At the risk of breaking build automata everywhere, may I suggest abusing git filter-branch to purge
the backdoor payloads from this repository permanently?

The backdoor files are harmless without inserting a trigger code when a release tarball is created.

@Larhzu except someone could fork this repository, add a new feature on top of xz, package it as "something else entirely", get it added to distributions with enough of a refined social attack, and cut a version with the backdoor resurrected. At that point, the attack vector would be back into play.

Or someone could just copy the attack vector into a completely different package altogether and adapt it somehow.

I get what you're saying, it's probably overkill, I'm probably paranoid. At the same time, is it really paranoia if I can already imagine ways that this attack vector could be resurrected without much effort? If I, a random user of the package, can just ramble about these, I'm unfortunately assuming the original attackers can already think of far worse.

@Larhzu
Copy link
Member

Larhzu commented Apr 12, 2024

So someone could fork the repository, re-apply the reverted backdoor commits, and add a trigger back too? Yes. But one could just add fresh backdoor files too, or copy the old ones from somewhere else. The other copies won't disappear even if 200 commits are rebased here.

It's not paranoia if they really are after you. Or after the project you maintain. ;-)

A more real reason to do the rebase is to avoid the bad files causing trouble with antivirus software. I don't know how big problem that is though.

@cybik
Copy link

cybik commented Apr 12, 2024

Fair enough!

At the very least your repository would have less chances of being pointed at :P

@anarazel
Copy link

So someone could fork the repository, re-apply the reverted backdoor commits, and add a trigger back too? Yes. But one could just add fresh backdoor files too, or copy the old ones from somewhere else. The other copies won't disappear even if 200 commits are rebased here.

I agree that rebasing/filtering doesn't seem worth the cost. However, it may be worth to remove the backdoored release tags and add them back under a different name, to avoid tooling from automatically using them. I'd go for something like backdoored-vX.Y.Z, so they're clearly in a different namespace.

@borestad
Copy link

borestad commented Apr 15, 2024

Just my 5 cents ....but AFAIK tools like bfg does use a mix of old sha's and new sha's.

This makes it possible to delete/clean only the effected commits and keep all original sha's

https://stackoverflow.com/a/49866781

The BFG avoids all this by rewriting the original references without keeping backups in refs/original/ (and is of course much faster and more convenient than git filter-branch). Nonetheless, it's still effectively copying all the original commits to new ones. Your copied repository is, in effect, a mostly-new repository, which should never be mixed with the old one.

@eli-schwartz
Copy link

eli-schwartz commented Apr 15, 2024

Just my 5 cents ....but AFAIK tools like bfg does use a mix of old sha's and new sha's.

This makes it possible to delete/clean only the effected commits and keep all original sha's

That is not how BFG works, and even your quote is actually saying that BFG avoids making your repository twice as big, because it deletes the old history instead of keeping it around as a backup branch.

git is an immutable append-only tree. A newer commit cannot keep the same sha1 unless it is based on the original commit containing the backdoor. This is an extremely important core security guarantee of git, with the unfortunate side effect that you cannot forge a fake commit that claims to be a newer commit but is actually a rebased one, even if you're the legal owner of the repository and you really have a good reason to want to do so.

@mathstuf
Copy link

If history is rewritten1, one can use git-replace to map old hashes to new hashes.

Footnotes

  1. if my vote counts for anything, I don't think it should be; removing all of this from the public record makes it harder to discuss as a historical incident in the future.

@xodiumluma
Copy link

xodiumluma commented Apr 24, 2024

I noticed that there were no code reviews. Jia Tan just committed directly. Is this a weakness...? 🤔

@thesamesam
Copy link
Member Author

thesamesam commented Apr 24, 2024

Reviews can be (and were) done in places other than github PRs.

@xodiumluma
Copy link

Thanks @thesamesam. Most of the other projects have GitHub-visible PRs with practices like signed commits. I think it's nice at least from a visibility perspective. Is this something we can move towards? Or is there somewhere else where we already have this in place?

@Larhzu
Copy link
Member

Larhzu commented Apr 24, 2024 via email

@xodiumluma
Copy link

Thank you so much @Larhzu - really appreciate your leadership and reply.

@eli-schwartz
Copy link

Thanks @thesamesam. Most of the other projects have GitHub-visible PRs with practices like signed commits. I think it's nice at least from a visibility perspective. Is this something we can move towards? Or is there somewhere else where we already have this in place?

Note that signed commits are largely incompatible with GitHub PRs, since GitHub PRs imply merging the PRs on the GitHub web UI and that in turn means that you cannot sign the commits you merged. It's possible as a project maintainer to submit a PR and then merge it via fast-forward on the command line, in which case the same commit sha1 is present both in the PR and in the repository's master branch, and GitHub will also mark the PR as merged.

https://github.com/orgs/community/discussions/6414

@mathstuf
Copy link

You can also merge locally, sign the merge commit, and push that. The PR's HEAD commit being reachable from its target branch will mark it as merged (squash merging is out-of-luck though).

@eli-schwartz
Copy link

Yeah, that's a subcategory of the case I mentioned. Unfortunately that does require using merge commits which not everyone wants to be limited to... The critical bit here is you're constrained to merging in the exact commit sha1 from the PR branch.

Many people, myself included, prefer to rebase merge (or cherry-pick merge, if you prefer to think of it that way) and consider squash merging to be a horrifying sin. :p

Maybe someday GitHub will implement the feedback in question, though I wouldn't hold my breath for it.

@mathstuf
Copy link

GitLab has a way to force "allow maintainer to push updates" to allow for rebase-merge to be done without PR author participation. Not sure if Github does (unlikely given the general disdain for force-push in the general Github workflow).

FWIW, I also agree with you on squash-merge being a sin, but some projects just cannot seem to have good commit hygiene from contributors and/or are unwilling to require rebasing fixes into introducing commits and just throw in the towel with squash-merges-by-default. Rebase-ff-merge is…OK, but I like having the merge commit as a place for topic-level metadata such as "who merged it", an easy handle to revert as a unit, and places for Reviewed-by-like trailers, PR information, etc.

@coderhisham
Copy link

Please Remove the releases by Jia Tan. and create a new release

@JohnDoe1001
Copy link

Have you considered reverting all changes yet? If so, then ty for your effort. I hope that the Linux community become stronger and more secure, especially against doomsday vulnerabilities like this.

Btw could I become a document maintainer?

@Larhzu
Copy link
Member

Larhzu commented May 1, 2024

Rebasing

The repository will not be rebased. The malicious files have been reverted. Even if one checks them out, they are harmless without the trigger code that was never committed to the Git repository. Thus things like git bisect are safe to use.

Merges

I don't do merges via GitHub UI. I do merges locally as then the same process works no matter where the Git repository is hosted. The commit 97f0ee0 I merged locally without rebasing, thus it shows GitHub as the committer. I might try to avoid this in the future.

If I understand it correctly, signing commits without merge commits doesn't sound impractical: Rebase the new commits locally first. Adding Closes: <URL-to-PR> to a commit message will close the PR and, even better, this leaves a link to the commit log that points to the PR discussion. Pushing to the submitter's branch (if allowed) before merge could make sense too but I don't know how much it matters if GitHub doesn't show the PR as merged. The discussion should make things clear anyway.

Force pushes

@ItzSwirlz You commented in PR #95 about force pushes.

I manually mirror to git.tukaani.org where I have never force pushed. Apart from the very early days on GitHub, when the primary repository was still on git.tukaani.org and the repository on GitHub wasn't officially announced anywhere yet, there were no force pushes to the master or v5.x branches on GitHub, or at least I didn't detect any. If they happened, they must have happened so that I never fetched the thrown-away commits.

To the xz-java repository on GitHub I had to force push after my account was reinstated because I didn't have a local copy of Jia's last commits (that created SECURITY.md). I had already pushed new commits to git.tukaani.org while the repository on GitHub was unavailable to me.

Spam problem

@JohnDoe1001: This kind of behavior isn't acceptable and may earn you a ban instead of maintainer status.

The amount of spam on GitHub is distracting from useful activities. There are many places where people can make humor-only posts about the recent events. Everyone, please keep those posts out of GitHub (assuming that you aren't in the enemy team). Thanks!

@ItzSwirlz
Copy link

ItzSwirlz commented May 1, 2024

@Larhzu Just a protip: You can merge the commit and then add Co-authored-by which will show more than one person committed something on GitHub. So if a commit is made, say 'Merge branch "foo" from "source"', edit it to say in the last line:

Co-authored-by: Lasse Collin <emailaddress>

(link to GitHub documentation)

And I may be wrong, but I think if you add your GPG key to GitHub, then make a signed commit with -S, even with the mirror as long as GitHub recognizes your commit is signed by your key it will show it as Verified. (github documentation on 'verified' commits)

I hope this is helpful :)

@Larhzu
Copy link
Member

Larhzu commented May 1, 2024

Co-authors

Linux uses Co-developed-by:. It sounds similar to Co-authored-by:, not sure if exactly identical though.

If I edit a commit, I often feel I should mention it because it's possible that I made an error and it would be wrong to silently put such errors to someone else's name. Using co-something-by in this case feels slightly odd if the main work was almost solely by the other person. It's just that it's good to have something in the commit message to indicate that others shouldn't so easily blame the commit author for errors in the commit.

It has also been suggested that one could commit the original thing first and then make a new commit to fix it or clean it up. Perhaps it's OK at least if the original commit isn't truly broken. In some cases this makes commit log harder to read though.

So the problem, if there is any, is how to be polite to the contributors so that they feel they were welcome.

Years ago I used to take patches, edit them, thank the author in the commit message (including if it was a patch etc.), and commit it in my name. In GitHub times it seems that to many it's very valuable to get their name as an author in the commit log as that will then show up on GitHub stats.

With tiny things it often would be faster for everyone to just create an issue (or send an email) and let maintainer(s) make the change but then the reporter wouldn't get their name in the commit log in the preferred way. Thus nowadays I try to commit things so that contributors get their name as a commit author and not merely mentioned in the commit message and in the file THANKS.

Signed commits and tags

I recently added my key to GitHub as instructed by someone on IRC. This made the tags signed by me appear as Verified. To my understanding, it would work the same with signed commits.

Mirroring the repository to multiple places is unrelated to the signatures. Storing signatures is a feature of Git.

@eli-schwartz
Copy link

Linux uses Co-developed-by:. It sounds similar to Co-authored-by:, not sure if exactly identical though.

I thought they used co-authored-by. The git project does: https://git-scm.com/docs/SubmittingPatches

The kernel project seems to use rather a lot of both styles. Github AFAIK only respects the git project's trailer, not the kernel project's trailer.

@Larhzu
Copy link
Member

Larhzu commented May 1, 2024

Co-developed-by is in the Linux docs. In recent years it seems to be 50 times more common than Co-authored-by. Perhaps Linux usage is old and thus a bit specific to that project.

Github AFAIK only respects the git project's trailer, not the kernel project's trailer.

That's how I understood it as well.

@Larhzu
Copy link
Member

Larhzu commented May 1, 2024

Questions

Tags signed by Jia

What to do with the v5.2.x and v5.4.x tags that are signed by Jia?

  1. Replace them with my signature on the same commits?

  2. Add a new commit (possibly even an empty one) to the v5.2 and v5.4 branches and sign that to indicate that I'm fine with the content?

  3. Do nothing?

What to do with the v5.6.x tags? Without the trigger code the backdoor files are harmless. However, I think it would be odd for me to sign those tags still. So I currently think I should leave them as is. I don't see a reason to delete the v5.6 branch.

Tarballs by Jia

Would a re-release of 5.2.12 and 5.4.6 be useful? Those likely are the last versions of the v5.2 and v5.4 branches.

New tarballs would be generated with newer Autotools and be signed by me. The filename would need to differ to avoid confusion. For example, xz-5.4.6-rerelease.tar.gz.

Should release files created by Jia be removed from official locations? Some of those I reviewed with diff -Narup against my release tarball candidates back in the day (it was part of the process to teach him to do the packages well). I currently think it's unlikely that those contain anything bad. However, if those are kept available then the public will (rightfully) think that I have approved them even if I never signed them. Thus I would need to review the old packages to some extent in the near future.

Miscellaneous

This thread has wishes and suggestions around these questions already but not much explanation why those wishes are being made. Thus, I would like to hear a sentence or two why one choice is preferred over another.

To those without a GitHub account, replies via IRC or email are fine too.

@Neustradamus
Copy link

Dear @Larhzu,

Several people would like that you do new releases (to remove the username of the backdoor author) and good asset files:

And maybe good to remove the 5.6.x tags too?

Note: Same in primary git source place.

Thanks in advance.

@Larhzu
Copy link
Member

Larhzu commented May 2, 2024

@Neustradamus:

There are good reasons to make new releases. Just to remove a name isn't one though.

You didn't provide even a single reason why tags should be removed. My message in this thread specifically asked for reasoning about tags and releases.

I'm more than aware that people want progress. Your messages don't help. Instead, they waste my and others' time when we are busy already. Thus, each post you make is delaying the new releases that you are requesting.

@Neustradamus
Copy link

Neustradamus commented May 3, 2024

@Larhzu: Thanks for your answer :)

To remove and recreate release tags (and add good files), it is to have not bad point with the backdoor author, it is better to have from you but of course it is not only a good reason.

About tags, it is to do not permit to download 5.6.0 and 5.6.1 (and 5.5.x too) for security:

@thesamesam
Copy link
Member Author

We believe at this point that tags are fine / inert by themselves and we wouldn't want to destroy them, but we may rename/retag them with Lasse's signature, as discussed above.

@Disservin
Copy link

Disservin commented May 6, 2024

Linux uses Co-developed-by:. It sounds similar to Co-authored-by:, not sure if exactly identical though.

If I edit a commit, I often feel I should mention it because it's possible that I made an error and it would be wrong to silently put such errors to someone else's name. Using co-something-by in this case feels slightly odd if the main work was almost solely by the other person. It's just that it's good to have something in the commit message to indicate that others shouldn't so easily blame the commit author for errors in the commit.

It has also been suggested that one could commit the original thing first and then make a new commit to fix it or clean it up. Perhaps it's OK at least if the original commit isn't truly broken. In some cases this makes commit log harder to read though.

So the problem, if there is any, is how to be polite to the contributors so that they feel they were welcome.

Years ago I used to take patches, edit them, thank the author in the commit message (including if it was a patch etc.), and commit it in my name. In GitHub times it seems that to many it's very valuable to get their name as an author in the commit log as that will then show up on GitHub stats.

With tiny things it often would be faster for everyone to just create an issue (or send an email) and let maintainer(s) make the change but then the reporter wouldn't get their name in the commit log in the preferred way. Thus nowadays I try to commit things so that contributors get their name as a commit author and not merely mentioned in the commit message and in the file THANKS.

For some context we (I am one of the maintainers) Stockfish, a chess engine, have been merging patches now almost completely locally because there are additional things we need to take care of while merging, we often manually "reword" the commit description and add a "closes" section with a link to the Github PR. Since most of our contributors don't write the description in the commit message but instead in the Github PR body. Somtimes patches also have to be reformatted but these changes are made within in the same commit in our case.

We also require a linear commit history and force pushes are generally disabled, when merging a patch a locally the author name is in the git commit log visable as the author and we (the maintainers) as the committers. Contributors generally also appear in the "Contributors" on Github. Their Github profile shows

Created 1 commit in 1 repository

The only minor downside (from a contributors perspective) is that GitHub shows the PR as closed and not merged :/

so afterall our contributors are generally fine with how things are handled and if you prefer the local way to do things there is nothing wrong with that

@mathstuf
Copy link

mathstuf commented May 6, 2024

The only minor downside is that GitHub shows the PR as closed and not merged :/

There is the "allow maintainers to push" option where you could push your to-be-merged edits to the source branch before merging so that the state tracking is accurate.

@Larhzu
Copy link
Member

Larhzu commented May 8, 2024

Thanks @Disservin! Since a major project is doing both edits to commit messages and reformatting the patches, I feel better doing it too, combining it with Co-authored-by: to mark that new errors might have been introduced by me.

I can try to remember to force-push to the PR branch (when it has been allowed) to get the PR marked as merged. But I won't consider that essential. Editing a commit message so that there is a link to the PR discussion seems valuable.

@Disservin
Copy link

@Larhzu no problem :D

I can try to remember to force-push to the PR branch (when it has been allowed) to get the PR marked as merged.

To my knowledge there is no there way to get it marked as merged when not merging over github directly. See here for example official-stockfish/Stockfish#5217 which is displayed as closed even though we merged it. We manually (or try to) apply a "to be merged" label to all merged patches.

@thesamesam
Copy link
Member Author

I think you can per #103 (comment).

@Larhzu
Copy link
Member

Larhzu commented May 8, 2024

#108 I merged locally and it shows up as merged on GH. However, it went as fast-forward without force-pushing to submitter's repo first so it's not the typical case. And it shows GitHub as the committer which I likely wish to avoid in the future.

@JohnDoe1001
Copy link

@Larhzu Sorry for the nagging. I guess we should just leave the issue to you guys. Good luck to y'all!

@thesamesam
Copy link
Member Author

thesamesam commented May 29, 2024

Lasse has updated the website, refreshed https://tukaani.org/xz-backdoor/, and published 1.0 of the review notes at https://tukaani.org/xz-backdoor/review.html. The article (reflections, lessons, etc) will be written and published separately and isn't ready.

We aren't done with the response, obviously, but this marks a major milestone in the project's recovery, hence I feel like this is the right time to close this particular bug.

Fixed releases are now out (5.2.13, 5.4.7, 5.6.2). You can compare their contents by making your own distribution tarball using make mydist. Future improvements in minimising the diff there are planned too. Thank you everyone.

@Neustradamus
Copy link

@Larhzu: Thanks to have fixed the backdoor!
I have recently done announcements on my social networks and on HN, Reddit too.

A new PR in vcpkg is here:

cc: @anarazel.

@cybik
Copy link

cybik commented Jun 7, 2024

The evil has been defeated

Troll comment abusing a css exploit. Please report it to Github.

You can use mobile apps to get around the CSS glitch, or use ublock origin to force-block https://github.com/Roblox/t/assets/106361566/b3306f20-57e8-449d-95f7-0ec0597b4e7e in My Filters

@Neustradamus
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

16 participants