Commit

Tests: Update two test files.
The original files were generated with random local to my machine.
To better reproduce these files in the future, a constant seed was used
to recreate these files.
JiaT75 committed Mar 9, 2024
1 parent a3a29bb commit 6e63681
Showing 2 changed files with 0 additions and 0 deletions.
Binary file modified tests/files/bad-3-corrupt_lzma2.xz
Binary file not shown.
Binary file modified tests/files/good-large_compressed.lzma
Binary file not shown.

86 comments on commit 6e63681

@Extravi

looks bad to me >:(

LGTM :3

@Safari77

Quite a dude!
This was created yesterday:
https://github.com/tukaani-project/xz-java/blob/master/.github/SECURITY.md
If you discover a security vulnerability in this project please report it privately. Do not disclose it as a public issue.

@P-EB commented on 6e63681 Mar 29, 2024

Purrrrrrfect

@noita-player

thanks for the quick fix OP, installing right away.

@N-R-K commented on 6e63681 Mar 29, 2024

:shipit:

@mrkubax10

Definitely updating!

@presentfactory

how do you use that to construct a known-bad lzma archive? we're talking about the kind of blobs that a fuzzer would find, that you'd commit to prevent regressions

Well in this case all you'd need to do is have a script generate a good archive and then in the same script modify some bytes to make it a bad archive, with comments documenting why you're modifying those bytes (which would be better to help people understand the tests anyways).

I do think in this project's case you could do most of the testing without any binary blobs, but I also would not say this applies to everything generally. Some more complex applications may simply need binary stuff for the sake of time. Sure, you can in theory generate image tests for an offline renderer from the renderer itself, but doing so for 100 images may take hours, which is not really desirable. Of course in those cases the binary data isn't being extracted or anything... but you get the idea: sometimes using binary blobs makes sense for the sake of time or difficulty in synthesizing the test data.
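
A minimal sketch of that "generate a good archive, then patch it" idea in Python, assuming the standard-library `lzma` module is an acceptable stand-in for the project's own tooling. The function names and the payload are hypothetical, and this is not how the xz test files were actually produced. It builds a valid .xz stream from a fixed payload, then corrupts exactly one documented byte:

```python
import lzma

# Fixed, human-readable payload: nothing random, nothing opaque.
PAYLOAD = b"xz regression test payload\n" * 64


def make_good_archive() -> bytes:
    """Compress the fixed payload into a valid .xz stream."""
    return lzma.compress(PAYLOAD, format=lzma.FORMAT_XZ)


def make_bad_archive(good: bytes) -> bytes:
    """Return a copy of `good` with one deliberately corrupted byte."""
    bad = bytearray(good)
    # The .xz stream header is 12 bytes: 6 magic bytes, 2 bytes of stream
    # flags, then a 4-byte CRC32 of those flags. Offset 8 is the first CRC32
    # byte, so inverting it makes the header checksum mismatch.
    bad[8] ^= 0xFF
    return bytes(bad)


if __name__ == "__main__":
    good = make_good_archive()
    bad = make_bad_archive(good)
    assert lzma.decompress(good) == PAYLOAD
    try:
        lzma.decompress(bad)
    except lzma.LZMAError as exc:
        print("rejected as expected:", exc)
```

With a script like this, a reviewer reads a one-line patch and its comment instead of trusting an opaque file.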

@habnabit commented on 6e63681 Mar 29, 2024

how do you use that to construct a known-bad lzma archive? we're talking about the kind of blobs that a fuzzer would find, that you'd commit to prevent regressions

Well in this case all you'd need to do is have a script generate a good archive and then in the same script modify some bytes to make it a bad archive, with comments documenting why you're modifying those bytes (which would be better to help people understand the tests anyways).

as i said in another comment,

if you commit a script that generates a malicious blob, and the source gives a plausible explanation for why it isn't malicious, what have you gained there?

you're talking about a process improvement, sure, but "binary blobs are bad" is not exactly the lesson i think needs to be taken away here

see also: https://en.wikipedia.org/wiki/Underhanded_C_Contest

@presentfactory

Well I think the idea is more that it'd be hard to embed a binary payload for an attack in a script without being suspicious. In theory you should only need to modify a few bytes in an archive to cause the code to hit some problematic path a fuzzer found, if you're injecting a kilobyte of nonsense into it then it should raise some more questions. Idk how big this attack was specifically but given the analysis of it, it sounds like it has a fair bit of code to inject its payload and to compromise the RSA function it replaces.

But yes generally I wouldn't say "binary blob = bad", just perhaps developers of projects which can benefit from source-only test synthesis should do that for higher assurance.
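
The "few bytes versus a kilobyte of nonsense" heuristic can be sketched as a review helper. This is only an illustration: the function names and the 16-byte threshold are assumptions chosen for the example, and a small diff is of course no proof that a fixture is harmless.

```python
def changed_offsets(good: bytes, bad: bytes) -> list[int]:
    """List every offset at which the two blobs differ."""
    offsets = [i for i, (a, b) in enumerate(zip(good, bad)) if a != b]
    # Bytes present in only one of the blobs also count as changes.
    shorter, longer = sorted((len(good), len(bad)))
    offsets.extend(range(shorter, longer))
    return offsets


def looks_like_small_patch(good: bytes, bad: bytes, max_changed: int = 16) -> bool:
    """True if the 'bad' fixture differs from the 'good' one in only a few bytes."""
    return len(changed_offsets(good, bad)) <= max_changed


if __name__ == "__main__":
    good = bytes(2048)
    bad = bytearray(good)
    bad[100] = 0xFF  # a single documented corruption
    print(changed_offsets(good, bytes(bad)))
    print(looks_like_small_patch(good, bytes(bad)))
```

A fixture that fails this check is not necessarily malicious, but it is the kind of change that deserves a written explanation.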

@janso3 commented on 6e63681 Mar 29, 2024

Perfect for gorgeous security, can push asap.

@izmyname

Sure, why not. ' LiNeKs Is MoRe SeCuRe', eh?

@presentfactory commented on 6e63681 Mar 29, 2024

You are conflating "binary blobs" with "proprietary software". Obviously proprietary software distributed solely as binary blobs is problematic as its source cannot be inspected. These are tests though and while yes it would be good to understand how they are generated (similar to the source of an executable) there are sometimes no feasible ways to synthesize the test data or to do so in a timely manner. For instance, consider training an AI, fundamentally this requires terabytes of "test data" in the form of images, it is totally impossible to synthesize these so any AI project even if open source must include such data.

Your lack of nuance here is the problem, as I very clearly said it would be a good idea for a project like this which can do source-driven test synthesis to do so, but this is not always possible.

@presentfactory commented on 6e63681 Mar 29, 2024

Where did I even mention disassembly lol. Are you even reading anything or just inventing things to strawman me with in realtime?

@xyzeva commented on 6e63681 Mar 29, 2024

have we considered the possibility of the random implementation just generating a malicious payload on accident with a random seed? 😀

@presentfactory

@xyzeva Well yes, this is not the only part of the attack. Various other commits and changes to say the build script were done to get this to work. It's very obviously deliberate and likely something that the author had been planning for a fairly long time.

I saw a commit from them for instance referencing the ifunc stuff they used to do this attack here a year ago, so that makes me wonder if they were already planning to implement it this way back then, or if adding ifunc stuff to the project is what gave them the inspiration to. Either way it's not an accident.

@xyzeva commented on 6e63681 Mar 29, 2024

@xyzeva Well yes, this is not the only part of the attack. Various other commits and changes to say the build script were done to get this to work. It's very obviously deliberate and likely something that the author had been planning for a fairly long time.

I saw a commit from them for instance referencing the ifunc stuff they used to do this attack here a year ago, so that makes me wonder if they were already planning to implement it this way back then, or if adding ifunc stuff to the project is what gave them the inspiration to. Either way it's not an accident.

was a joke, should've indicated

@presentfactory

I stopped reading

Yes not reading tends to make you look like an idiot. Glad to sort that out lol. Maybe try to like, read anything I said before inventing arguments that are not being made. You can't strawman your way to victory.

@emansom commented on 6e63681 Mar 29, 2024

This project needs to be quarantined, all commits made by @JiaT75 and other projects he contributed code with and to, to be considered backdoors and this project to be taken over by a trusted party. If next month a new release is made by @JiaT75 and all distributions packagers just go along with it like nothing happened: nothing was learned from this supply chain attack.

There's a chance he will likely force-push, corrupting the history of this git repository. So even the repository itself shouldn't be trusted. Retrieve backups from really really old build machines before he ever contributed if possible.

Yes, be that paranoid. If you don't think that's necessary you don't grasp the severity of what has been exposed today.

@presentfactory commented on 6e63681 Mar 29, 2024

Any push against this is literally admission that you collaborated in this backdoor or are too stupid to have a meaningful opinion on software security and auditability.

You should try thinking in less absolutist terms lol. There is no such thing as absolutes when it comes to complex things like software development. You clearly have never written any meaningful software before in your life if you think literally everything can be generated from source code (or be done so in a timely/practical manner). But then again, you didn't even read what I said so you probably didn't consider this, given I wasn't even saying this is the case for software like this anyways.

@JeremyStarTM

lgtm

@presentfactory

You should glow harder.

I am fairly yellow so that's just normal for me don't worry.

@presentfactory commented on 6e63681 Mar 29, 2024

Turns out training AI with AI generated data is not the best thing to be doing if you want to improve its quality. Not that you'd understand something like that though.

Also what about testing on data collected from physical sensors that software is meant to analyze? What about data that takes hours to calculate (e.g. reference renders, as I mentioned earlier)?

@partev commented on 6e63681 Mar 29, 2024

And so it begins. Always knew one day a nightmare supply chain attack would originate from GitHub.

open source supply chain attacks on GitHub have been happening for a very long time. This is the first time they got caught.

@presentfactory commented on 6e63681 Mar 29, 2024

Once again, stop strawmanning. This discussion is about the applicability of binary blobs for tests in software in general. You yourself understood this not but a few posts ago as you were talking about AI synthesized images and whatnot, so don't even act like suddenly I am specifically talking about this project as I already said that projects like this which can generate test data from readable source should probably do so (so your entire argument is as usual based on a strawman too).

Glad to know you finally agree though.

@presentfactory commented on 6e63681 Mar 29, 2024

you're the one strawmanning "in general" dipshit, this is xz, it's trivial.

Nope. You said the following particularly non-nuanced unqualified statements which are clearly talking about software in general not just this project:

Binary blobs are always bad

Everyone should actively avoid binary blobs and always complain when told to use them

Stay mad.

@greyltc

lgtm, ship it! 🚢🚢🚢🚢

@ceticamarco commented on 6e63681 Mar 29, 2024

Does anyone have any information concerning the threat actor such as his identity or country of origin?

His username was previously used on some Taiwanese forum back in the early 2000s but since this account is probably a burner, the real question is: what happened to the previous xz maintainer(Lasse Collin)? Recently, he also introduced Jia Tan to the Kernel mailing list.

@sdr495 commented on 6e63681 Mar 29, 2024

Does anyone have any information concerning the threat actor such as his identity or country of origin?

Cybersecurity company in a friendly middle eastern country. No need to worry about it unless you're a terrorist.

@femboywiki

@presentfactory I will simply bask in irony of federal agent being a horsefucker.

please step away from the keyboard, you've fed the troll long enough thank you.

@presentfactory

I'm not trolling thank you very much, everything I said is perfectly reasonable (rather than unreasonable statements that suggest any sort of binary blob in software development is somehow evil). But yes once you devolve to that sort of ad hominem it's probably best to stop lol, we've been at that point for a while now.

@femboywiki

I'm not trolling thank you very much, everything I said is perfectly reasonable (rather than unreasonable statements that suggest any sort of binary blob in software development is somehow evil). But yes once you devolve to that sort of ad hominem it's probably best to stop lol, we've been at that point for a while now.

Yes I am aware that your trolling is subtle. Bravo.

@habnabit

I'm not trolling thank you very much, everything I said is perfectly reasonable (rather than unreasonable statements that suggest any sort of binary blob in software development is somehow evil). But yes once you devolve to that sort of ad hominem it's probably best to stop lol, we've been at that point for a while now.

Yes I am aware that your trolling is subtle. Bravo.

oops @mrbid wrong account 🙂

@DanielRuf

I doubt that the origin of a person is that relevant. I'm just here for the technical facts.

Did anyone already check the bash script from the oss-security post and run it against some distros?

https://www.openwall.com/lists/oss-security/2024/03/29/4
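
The linked script remains the authoritative check. As a much cruder first pass, here is a sketch that compares the installed version against 5.6.0 and 5.6.1, the two releases that the public advisory (CVE-2024-3094) lists as backdoored. The helper names are hypothetical. Note that `xz --version` reports the command-line tool's version, which usually but not necessarily matches the liblzma that sshd loads, and that a distro may have patched or reverted a package without changing these numbers.

```python
import re
import subprocess
from typing import Optional

# Releases named as backdoored in the public advisory (CVE-2024-3094).
AFFECTED = {"5.6.0", "5.6.1"}


def parse_xz_version(text: str) -> Optional[str]:
    """Extract the first x.y.z version number from `xz --version` output."""
    match = re.search(r"\b(\d+\.\d+\.\d+)", text)
    return match.group(1) if match else None


def is_affected(version: Optional[str]) -> bool:
    """True only for an exact match with a known-bad release."""
    return version in AFFECTED


def installed_xz_version() -> Optional[str]:
    """Ask the local `xz` binary for its version; None if xz is not installed."""
    try:
        result = subprocess.run(
            ["xz", "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None
    return parse_xz_version(result.stdout)


if __name__ == "__main__":
    version = installed_xz_version()
    print("xz version:", version, "| listed as affected:", is_affected(version))
```

A "not affected" result here only means the version string does not match; it is not a substitute for the checks in the oss-security post.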

@DanielRuf

@femboywiki

I'm not trolling thank you very much, everything I said is perfectly reasonable (rather than unreasonable statements that suggest any sort of binary blob in software development is somehow evil). But yes once you devolve to that sort of ad hominem it's probably best to stop lol, we've been at that point for a while now.

Yes I am aware that your trolling is subtle. Bravo.

oops @mrbid wrong account 🙂

🤫

@ceticamarco

I doubt that the origin of a person is that relevant. I'm just here for the technical facts.

It could clarify the type of organization behind this supply chain attack. Anyway, knowing what happened to the prior maintainer would be a good start, considering that he introduced this person to the Linux kernel mailing list as well.

@RosettaPwn

I'm sorry but the amount of people here that probably know nothing about security or are the typical low level security people that press buttons on their EDR and close tickets without understanding anything they're doing is hilarious. That's okay, keep me employed on the offense side though, it's great money 👍.

If you don't have the source for something, for example how and why a binary blob was created, then it's pretty difficult to ascertain if it is malicious, which becomes increasingly true the more complex the malicious code is... for example if they're aborting execution of malicious functions after doing checks for virtualization, cohabitation, and seeking specific targets doing environment fingerprinting.

@capicy is correct in pointing out the idiocy of accepting random binary blobs as safe without any vetting. It stands to reason that if you can't understand what you're looking at then how can you properly vet it as safe before pushing it out to millions of systems? Sounds like in that case literally no code reviewing is taking place.

Also if you have a my little pony avatar I'm just going to immediately assume you're both a troll and mentally ill so I suppose that excuses a good chunk of the wild comments here.

@DanielRuf

@capicy not sure, the backdoor is just exploitable via sshd and seems to allow some unauthenticated access, according to information.

If you don't have sshd (the server part) running, it's probably not exploitable.

https://linux.die.net/man/8/sshd

ssh and sshd are two different parts.

It could clarify the type of organization behind this supply chain attack. Anyway, knowing what happened to the prior maintainer would be a good start, considering that he introduced this person to the Linux kernel mailing list as well.

I see, people are jumping to geopolitical reasons. But that can be also a false flag. We don't know this maintainer personally, but assumptions don't help imho.

@gayalien

I'm sorry but the amount of people here that probably know nothing about security or are the typical low level security people that press buttons on their EDR and close tickets without understanding anything they're doing is hilarious. That's okay, keep me employed on the offense side though, it's great money 👍.

If you don't have the source for something, for example how and why a binary blob was created, then it's pretty difficult to ascertain if it is malicious, which becomes increasingly true the more complex the malicious code is... for example if they're aborting execution of malicious functions after doing checks for virtualization, cohabitation, and seeking specific targets doing environment fingerprinting.

@capicy is correct in pointing out the idiocy of accepting random binary blobs as safe without any vetting. It stands to reason that if you can't understand what you're looking at then how can you properly vet it as safe before pushing it out to millions of systems? Sounds like in that case literally no code reviewing is taking place.

Also if you have a my little pony avatar I'm just going to immediately assume you're both a troll and mentally ill so I suppose that excuses a good chunk of the wild comments here.

ok "RosettaPwn" tl;dr

if you have a my little pony avatar I'm just going to immediately assume you're both a troll and mentally ill

erm, discrimination much? all Bronies are now automatically mentally ill now because you say so..?

@jtbx commented on 6e63681 Mar 29, 2024

@ceticamarco https://www.mail-archive.com/xz-devel@tukaani.org/msg00566.html is an interesting find. Overly aggressive push to get a different maintainer and it just so happens to be on a repo that got backdoored after the fact.

Yeah this is looking really suspicious, thanks for pointing this out.

I haven't lost interest but my ability to care has been fairly limited
mostly due to longterm mental health issues but also due to some other
things. Recently I've worked off-list a bit with Jia Tan on XZ Utils and
perhaps he will have a bigger role in the future, we'll see.

@presentfactory commented on 6e63681 Mar 29, 2024

Also if you have a my little pony avatar I'm just going to immediately assume you're both a troll and mentally ill so I suppose that excuses a good chunk of the wild comments here.

Nice ad hominem, maybe you could learn a thing or two about security if you didn't disregard people's opinions based on if they are a furry or not.

And again, perhaps stop strawmanning me, I said that source-driven testing is good and viable in cases like this. In fact I was advocating for this project doing so as it's totally doable for a tool which works off of slightly corrupted compressed data, and perhaps expected for a package that is so core to many other pieces of software.

I however reject idiotic absolutist perspectives by people who have never developed software before acting like there is no justifiable reason to use a binary blob in an application's testing (it is not ideal yes, but sometimes it has to be done). I provided many examples of where such things may be necessary either due to impossibility in synthesizing the data or impracticality due to time constraints, feel free to address them with ways to avoid such things rather than just pretending you know what you're talking about.

@BanementI

🥚

@Whanos commented on 6e63681 Mar 29, 2024

#goodfirstissue

@stikonas

@xyzeva chance is low, and if you used scripts to generate bad archives reproducibly, you wouldn't use random seed and a random program, you'd document which byte triggers what and then generate binary file using a script that's readable, preferably by a toddler.

Any push against this is literally admission that you collaborated in this backdoor or are too stupid to have a meaningful opinion on software security and auditability.

@presentfactory your babble is as readable as files in these commits.

In fact not just the blobs but huge unreadable autogenerated scripts should also not be considered source, e.g. who would read through 500 KiB of configure scripts? Everybody should just run autoreconf on them (though that wouldn't have helped in this particular bug I guess, since the backdoor was in an m4 file).
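
Where pseudo-random test data really is needed, the commit message's claim that "a constant seed was used" is only checkable if the generator and the seed are committed next to the file. A sketch of what that could look like, with a hypothetical seed, size, and function names (none of these come from the xz repository; `randbytes` needs Python 3.9 or later):

```python
import hashlib
import random

SEED = 20240309  # hypothetical: the seed would be recorded in the repository
SIZE = 4096      # hypothetical payload size in bytes


def generate_payload(seed: int = SEED, size: int = SIZE) -> bytes:
    """Regenerate the pseudo-random payload from the documented seed."""
    return random.Random(seed).randbytes(size)


def matches_committed(committed: bytes, seed: int = SEED, size: int = SIZE) -> bool:
    """True if the committed blob is exactly what the seed reproduces."""
    return committed == generate_payload(seed, size)


if __name__ == "__main__":
    blob = generate_payload()
    print("sha256:", hashlib.sha256(blob).hexdigest())
    print("reproducible:", matches_committed(blob))
```

Anyone can then rerun the generator and compare the result with the committed file, rather than taking the commit message on trust.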

@mgalgs commented on 6e63681 Mar 29, 2024

Scrolling through some of these commits is kinda scaring me... Was @JiaT75's account taken over by the malicious actors? Or did they really contribute to xz for two years in order to build trust so that they could then sneak in the exploit? If it was the latter then there should be some major alarm bells going off right now around the open source community... I still believe that open source is inherently more secure than closed source, but we have to stay on our toes...

Save us, Open Source Security Foundation, you're our only hope!

https://openssf.org/

@presentfactory commented on 6e63681 Mar 29, 2024

@mgalgs It's hard to say for sure given it is possible their account was only recently compromised or something and all the recent activity has not been them. Assuming it was them though the whole time I'd imagine either they joined the project with intent to do this later down the line, or more likely is that they either saw some money to be made by developing an exploit or were contacted by say a government entity or other group which proposed something like this to them (perhaps also in exchange for money). Maybe even they just did this for themselves, seeing some company or piece of software they wanted to compromise and used their own position to do so.

We'll probably never know for sure unless some actual law enforcement investigation is done on it...but I doubt that's going to happen as I do not know if this is even illegal. Hacking in some contexts is, but I am not sure if developing your own software maliciously is, especially if it never actually has any damaging effect on say a company due to being caught early like this.

@jocxfin

@DannyDaemonic

@presentfactory I almost responded with the same thing. I'm surprised this wouldn't be illegal though. If this isn't, the laws need to be changed. This clearly shows intent.

@DanielRuf

@mgalgs that doesn't help much if anyone who you trust goes rogue.

It may have been planned from the beginning (see also the first link at https://boehs.org/node/everything-i-know-about-the-xz-backdoor, @presentfactory you might want to take a look at it too), but still, people are always the weakest links and social engineering works in many cases.

@DanielRuf

@DannyDaemonic if something is illegal or not, it won't stop bad actors like black hats, APTs (contracted by governments for example) and other players on the digital battlefield.

Laws don't interest such groups.

@DannyDaemonic

@DanielRuf I'm certainly not claiming it would have prevented this, but, I mean, we can all agree it should be illegal, right?

Also, I don't think we know what happened yet. There's still a chance the guy just got bored.

@DanielRuf

Also, I don't think we know what happened yet. There's still a chance the guy just got bored.

For two years? I doubt that.

https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5005680#gistcomment-5005680

Laws don't affect people without ethics. That's well known in the hacker world.

@mrbid commented on 6e63681 Mar 29, 2024

Also, I don't think we know what happened yet. There's still a chance the guy just got bored.

For two years? I doubt that.

https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5005680#gistcomment-5005680

Laws don't affect people without ethics. That's well known in the hacker world.

In a world ever increasingly lacking good role models what are we to expect, what has happened here is both sensational and insidious and yet my heart extends to this threat actor in the hope that he finds a more fulfilling path and better place in life.

@DannyDaemonic

This comment was marked as off-topic.

@DanielRuf

This comment was marked as off-topic.

@geoff-m

This comment was marked as spam.

@noita-player

This comment was marked as spam.

@arzlo

This comment was marked as spam.

@swonlee-13

This comment was marked as spam.

@Buggem

This comment was marked as spam.
