Replace IO read on binary files with File binread #16325

Merged 2 commits on Mar 24, 2022

@sjanusz-r7 sjanusz-r7 commented Mar 10, 2022

This PR replaces IO.read with File.binread, in scenarios where it's obvious that we're reading from binaries, to prevent an issue where not all of the file has been read correctly due to an additional EOL<->CRLF conversion that happens on Windows. Specifying the mode on File.read or IO.read to include b would also fix the issue.

"b"  Binary file mode
     Suppresses EOL <-> CRLF conversion on Windows. And
     sets external encoding to ASCII-8BIT unless explicitly
     specified.

Example:

[6] pry(#<Msf::Modules::Exploit__Windows__Fileformat__Adobe_pdf_embedded_exe::MetasploitModule>)> IO.read("C:/metasploit-framework/data/exploits/CVE-2010-1240/template.pdf").length
=> 563
[7] pry(#<Msf::Modules::Exploit__Windows__Fileformat__Adobe_pdf_embedded_exe::MetasploitModule>)> IO.read("C:/metasploit-framework/data/exploits/CVE-2010-1240/template.pdf", mode: 'rb').length
=> 618
[8] pry(#<Msf::Modules::Exploit__Windows__Fileformat__Adobe_pdf_embedded_exe::MetasploitModule>)> File.binread("C:/metasploit-framework/data/exploits/CVE-2010-1240/template.pdf").length
=> 618
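
A minimal, platform-independent sketch of the three calls compared above. The temp file, its contents and the `compare_reads` helper are made up for illustration. On Linux or macOS all three reads should return the same number of bytes, so the visible difference there is the encoding of the returned string. On Windows, text mode also treats a 0x1A byte as an end-of-file marker according to the Ruby IO documentation, which would explain reads that stop well short of the file size.

```ruby
require 'tempfile'

# Returns [bytesize, encoding] for each way of reading +path+.
def compare_reads(path)
  {
    'File.read' => File.read(path),               # text mode
    "File.read mode: 'rb'" => File.read(path, mode: 'rb'),
    'File.binread' => File.binread(path)          # binary mode shorthand
  }.transform_values { |data| [data.bytesize, data.encoding] }
end

# Hypothetical sample: bytes that text mode on Windows may alter or stop at.
BINREAD_SAMPLE = "%PDF-1.4\r\n\x1A\x00\xFF binary tail".b

Tempfile.create('binread-demo') do |f|
  f.binmode
  f.write(BINREAD_SAMPLE)
  f.flush
  compare_reads(f.path).each do |call, (size, encoding)|
    puts "#{call}: #{size} bytes, #{encoding}"
  end
end
```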

Relevant issue: #16285

Cross-referencing a PR that also takes care of file reads: #16174

Bug - Not reading full file contents

When trying to read a file without specifying the mode with File or IO:

irb(main):001:0> IO.read("ext_server_python.x64.debug.dll").length
=> 599
irb(main):002:0> IO.read("ext_server_python.x64.debug.dll", mode: 'rb').length
=> 7086592
irb(main):004:0> File.binread("ext_server_python.x64.debug.dll").length
=> 7086592
irb(main):005:0> File.read("ext_server_python.x64.debug.dll").length
=> 599
irb(main):006:0> File.read("ext_server_python.x64.debug.dll", mode: 'rb').length
=> 7086592
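
One way to spot a short read like the 599-byte result above is to compare the number of bytes read with the size on disk. This is only a sketch: `complete_read?` is a hypothetical helper, not something in the framework, and a text-mode read on Windows can legitimately come back smaller than `File.size` because of CRLF translation alone.

```ruby
require 'tempfile'

# Hypothetical helper: true when +data+ has as many bytes as the file on disk.
def complete_read?(path, data)
  data.bytesize == File.size(path)
end

# Every byte value, repeated, as a stand-in for a DLL or other binary.
COMPLETE_READ_PAYLOAD = (0..255).to_a.pack('C*') * 16

Tempfile.create('complete-read') do |f|
  f.binmode
  f.write(COMPLETE_READ_PAYLOAD)
  f.flush

  data = File.binread(f.path)
  puts "binread complete?   #{complete_read?(f.path, data)}"
  # A read that stopped early fails the check.
  puts "truncated complete? #{complete_read?(f.path, data.byteslice(0, 599))}"
end
```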

Verification

  • Start msfconsole
  • use exploit/windows/fileformat/adobe_pdf_embedded_exe
  • set exename {...}
  • Try to run
  • Confirm the run command fails without these changes.
  • Verify this PR fixes the issue and that the PDF is generated correctly

Before (using exploit/windows/fileformat/adobe_pdf_embedded_exe)

msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > run

[*] Reading in 'C:/metasploit-framework/data/exploits/CVE-2010-1240/template.pdf'...
[*] Parsing 'C:/metasploit-framework/data/exploits/CVE-2010-1240/template.pdf'...
[-] Sorry, I'm picky. Incompatible PDF structure, please try a different PDF template.

After (using exploit/windows/fileformat/adobe_pdf_embedded_exe)

msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > run

[*] Reading in 'C:/metasploit-framework/data/exploits/CVE-2010-1240/template.pdf'...
[*] Parsing 'C:/metasploit-framework/data/exploits/CVE-2010-1240/template.pdf'...
[*] Using './win10met.exe' as payload...
[+] Parsing Successful. Creating 'evil.pdf' file...
[+] evil.pdf stored at C:/Users/simon/.msf4/local/evil.pdf

Comment on lines -71 to -72
# bug fix for: data = ::IO.read(datastore['PEXEC'])
# the above does not return the entire contents

👀 👀 👀

lol. perhaps we should add a rubocop rule for this?
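
As a rough idea of what such a rule would look for, here is a plain-Ruby line scanner. It is a crude stand-in and not a real RuboCop cop: it uses regular expressions instead of the AST, only understands the `mode:` and `binmode:` keyword forms, and all names in it are hypothetical.

```ruby
# Start of an IO.read(...) or File.read(...) call. File.binread does not match.
READ_CALL = /\b(?:IO|File)\.read\(/

# Binary mode requested through the mode: or binmode: options.
BINARY_OPTION = /mode:\s*['"]\w*b\w*['"]|binmode:\s*true/

# Returns [line_number, line] pairs for reads with no binary mode on the line.
def suspicious_reads(source)
  source.each_line.with_index(1).filter_map do |line, number|
    match = READ_CALL.match(line)
    next unless match
    next if BINARY_OPTION.match?(match.post_match)

    [number, line.strip]
  end
end

SCANNER_SAMPLE = <<~'RUBY'
  data = ::IO.read(datastore['PEXEC'])
  stream = Rex::Text.zlib_deflate(File.binread(payload_exe))
  md5 = Rex::Text.md5(File.read(payload_exe))
  pdf = File.read(path, mode: 'rb')
RUBY

suspicious_reads(SCANNER_SAMPLE).each { |number, line| puts "#{number}: #{line}" }
```

A scanner like this flags text files too, so every hit still needs a human decision about whether the file is binary.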

@adfoster-r7
Contributor

Looks like the issue exists with File.read too, which I believe just delegates to IO.read under the hood:

> irb                                      
3.0.2 :001 > File.method(:read)
 => #<Method: File(IO).read(*)> 

Calls to File.write may also write incorrect data if the mode isn't specified correctly. We'll have to handle that as a separate effort though, as it's a more tangled thread to pull at than replacing these IO.read calls.
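
A short sketch of both points: the class-level read and write helpers on File are inherited from IO, and the write side has a binary counterpart in `File.binwrite`. The byte string and temp file name are made up for illustration.

```ruby
require 'tempfile'

# File inherits these class methods from IO, so the owner is IO's singleton class.
%i[read binread write binwrite].each do |name|
  puts "File.#{name} is defined on #{File.method(name).owner}"
end

# Hypothetical bytes containing CRLF and 0x1A, which text mode may not preserve.
ROUND_TRIP_BYTES = "MZ\x90\x00\r\n\x1A\xFF".b

Tempfile.create('binwrite-demo') do |f|
  f.close
  File.binwrite(f.path, ROUND_TRIP_BYTES)
  puts "round trip intact? #{File.binread(f.path) == ROUND_TRIP_BYTES}"
end
```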

@gwillcox-r7
Contributor

Not 100% sure this fixes all the instances of this problem. Looking at https://github.com/rapid7/metasploit-framework/search?q=%22IO.read%22, I see a couple of cases where we have library functions that take in a parameter named io which is expected to be an IO object, and then they call .read() on that object.

Examples can be seen in lib/msf/core/exploit/remote/java/rmi/util.rb particularly at

# Extracts an string from an IO
#
# @param io [IO] the io to extract the string from
# @return [String, nil] the extracted string if success, nil otherwise
def extract_string(io)
  raw_length = io.read(2)
  unless raw_length && raw_length.length == 2
    return nil
  end
  length = raw_length.unpack('s>')[0]
  string = io.read(length)
  unless string && string.length == length
    return nil
  end

  string
end

# Extracts an int from an IO
#
# @param io [IO] the io to extract the int from
# @return [Integer, nil] the extracted int if success, nil otherwise
def extract_int(io)
  int_raw = io.read(4)
  unless int_raw && int_raw.length == 4
    return nil
  end
  int = int_raw.unpack('l>')[0]

  int
end

# Extracts a byte from an IO
#
# @param io [IO] the io to extract the byte from
# @return [Byte, nil] the extracted byte if success, nil otherwise
def extract_byte(io)
  byte_raw = io.read(1)
  unless byte_raw && byte_raw.length == 1
    return nil
  end
  byte = byte_raw.unpack('C')[0]

  byte
end

# Extracts a long from an IO
#
# @param io [IO] the io to extract the long from
# @return [Integer, nil] the extracted int if success, nil otherwise
def extract_long(io)
  int_raw = io.read(8)
  unless int_raw && int_raw.length == 8
    return nil
  end
  int = int_raw.unpack('q>')[0]

  int
end

# Extract an RMI interface reference from an IO
#
# @param io [IO] the io to extract the reference from, should contain the data
#   inside a BlockData with the reference information.
# @return [Hash, nil] the extracted reference if success, nil otherwise
# @see Msf::Exploit::Remote::Java::Rmi::Client::Jmx:Server::Parser#parse_jmx_new_client_endpoint
# @see Msf::Exploit::Remote::Java::Rmi::Client::Registry::Parser#parse_registry_lookup_endpoint
def extract_reference(io)
  ref = extract_string(io)
  unless ref && (ref == 'UnicastRef' || ref == 'UnicastRef2')
    return nil
  end

  if ref == 'UnicastRef2'
    form = extract_byte(io)
    unless form == 0 || form == 1 # FORMAT_HOST_PORT or FORMAT_HOST_PORT_FACTORY
      return nil
    end
  end

  address = extract_string(io)
  return nil unless address

  port = extract_int(io)
  return nil unless port

  object_number = extract_long(io)
  uid = Rex::Proto::Rmi::Model::UniqueIdentifier.decode(io)

  {address: address, port: port, object_number: object_number, uid: uid}
end

which shows several functions that are all documented in the comments as taking an IO object, and all of which call the .read() method on that object later on in the code.

To truly fix all instances of this issue, I believe we may need to do another pass through the code with this in mind. There also appear to be some cases where msf_io is used, but if https://github.com/rapid7/metasploit-framework/blob/04e8752b9b74cbaad7cb0ea6129c90e3172580a2/spec/support/shared/contexts/msf/string_io.rb is to be believed, I think this is using the StringIO class, which I don't think is the same thing. Just something to be aware of as you're searching 👍

@sjanusz-r7
Contributor Author

After some digging through the code, it turns out that these calls are using StringIO as seen below, meaning they do not need to be changed:

return_io = StringIO.new(end_point_block_data.contents, 'rb')
reference = extract_reference(return_io)
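
To illustrate why a StringIO source should be unaffected: it reads from an in-memory String rather than from a file handle, so no platform text-mode translation is involved. The sketch below mirrors the length-prefixed read in extract_string. The `read_prefixed` helper and the payload bytes are hypothetical, and the payload deliberately contains CRLF and 0x1A.

```ruby
require 'stringio'

# Reads a two-byte big-endian length, then that many bytes, like extract_string.
def read_prefixed(io)
  raw_length = io.read(2)
  return nil unless raw_length && raw_length.length == 2

  length = raw_length.unpack1('s>')
  string = io.read(length)
  string if string && string.length == length
end

# Hypothetical payload with bytes that text mode on Windows would disturb.
STRINGIO_BODY = "Unicast\r\n\x1ARef".b

io = StringIO.new([STRINGIO_BODY.bytesize].pack('s>') + STRINGIO_BODY, 'rb')
puts "payload intact? #{read_prefixed(io) == STRINGIO_BODY}"
```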


@gwillcox-r7
Contributor

It appears I haven't tried hard enough to break assumptions 😈 Let's see if I can find any other potential edge cases, but nice job spotting that this code isn't affected 👍

@gwillcox-r7
Contributor

Potentially missed files:
external/source/osx/x86/src/test/write_size_and_data.rb
lib/anemone/extractors/dirbuster.rb
modules/payloads/singles/cmd/windows/reverse_ruby.rb
modules/payloads/singles/cmd/windows/bind_ruby.rb
https://github.com/rapid7/metasploit-framework/blob/04e8752b9b74cbaad7cb0ea6129c90e3172580a2/modules/payloads/singles/cmd/unix/bind_ruby_ipv6.rb
https://github.com/rapid7/metasploit-framework/blob/04e8752b9b74cbaad7cb0ea6129c90e3172580a2/modules/payloads/singles/cmd/unix/reverse_ruby.rb
modules/post/multi/manage/system_session.rb

def check
  test_string = Rex::Text.rand_text_alphanumeric(encoded_swf.length)
  io = URI.parse(exploit_url(test_string)).open
  if io.read.start_with? test_string
    Msf::Exploit::CheckCode::Vulnerable
  else
    Msf::Exploit::CheckCode::Safe
  end
end

def self.run_command(command, exception: true)
  puts command
  result = ''
  ::Open3.popen2e(
    { 'BUNDLE_GEMFILE' => File.join(Dir.pwd, 'Gemfile') },
    '/bin/bash', '--login', '-c', command
  ) do |stdin, stdout_and_stderr, wait_thread|
    stdin.close_write
    while wait_thread.alive?
      ready = IO.select([stdout_and_stderr], nil, nil, 1)
      next unless ready
      reads, _writes, _errors = ready
      reads.to_a.each do |io|
        data = io.read_nonblock(1024)
        puts data
        result += data
      rescue EOFError, Errno::EAGAIN
        # noop
      end
    end

@sjanusz-r7
Contributor Author

sjanusz-r7 commented Mar 15, 2022

I think the suggested files above don't need to be changed, as they either don't interact with the filesystem or read from a process, so they aren't impacted. Correct me if I'm wrong. 💯

@gwillcox-r7 gwillcox-r7 left a comment

Proposed changes look good but will await updates to see if we need to add additional files.

@adfoster-r7 adfoster-r7 changed the title Replace IO read with File binread Replace IO read on binary files with File binread Mar 15, 2022
@sjanusz-r7 sjanusz-r7 force-pushed the replace-io-with-file branch 2 times, most recently from ebf7dc1 to 13c7989 Compare March 15, 2022 18:12
@adfoster-r7
Contributor

Syncing up with @sjanusz-r7 - we're keeping the scope of this PR to fixing the uses of IO.read that are obviously reading binary files

@gwillcox-r7
Contributor

Before the change on a Windows host (Linux hosts work fine):

C:\metasploit-framework\bin>msfconsole.bat
C:/metasploit-framework/embedded/lib/ruby/gems/3.0.0/gems/zeitwerk-2.5.4/lib/zeitwerk/kernel.rb:35: warning: Win32API is deprecated after Ruby 1.9.1; use fiddle directly instead
                                   ____________
 [%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%| $a,        |%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%]
 [%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%| $S`?a,     |%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%]
 [%%%%%%%%%%%%%%%%%%%%__%%%%%%%%%%|       `?a, |%%%%%%%%__%%%%%%%%%__%%__ %%%%]
 [% .--------..-----.|  |_ .---.-.|       .,a$%|.-----.|  |.-----.|__||  |_ %%]
 [% |        ||  -__||   _||  _  ||  ,,aS$""`  ||  _  ||  ||  _  ||  ||   _|%%]
 [% |__|__|__||_____||____||___._||%$P"`       ||   __||__||_____||__||____|%%]
 [%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%| `"a,       ||__|%%%%%%%%%%%%%%%%%%%%%%%%%%]
 [%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%|____`"a,$$__|%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%]
 [%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%        `"$   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%]
 [%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%]


       =[ metasploit v6.1.34-dev-c0185f65bfb7cb6d280cab7c47a34310d0e2e390]
+ -- --=[ 2208 exploits - 1170 auxiliary - 395 post       ]
+ -- --=[ 615 payloads - 45 encoders - 11 nops            ]
+ -- --=[ 9 evasion                                       ]

Metasploit tip: Display the Framework log using the
log command, learn more with help log

msf6 > use exploit/windows/fileformat/adobe_pdf_embedded_exe
[*] No payload configured, defaulting to windows/meterpreter/reverse_tcp
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > set exename C:\Windows\System32\notepad.exe
exename => C:WindowsSystem32notepad.exe
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > set EXENAME "C:\Windows\System32\notepad.exe"
EXENAME => C:\Windows\System32\notepad.exe
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > run

[*] Reading in 'C:/metasploit-framework/embedded/framework/data/exploits/CVE-2010-1240/template.pdf'...
[*] Parsing 'C:/metasploit-framework/embedded/framework/data/exploits/CVE-2010-1240/template.pdf'...
[-] Sorry, I'm picky. Incompatible PDF structure, please try a different PDF template.
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) >

@@ -117,7 +117,7 @@ def ef_payload(pdf_name,payload_exe,obj_num)
     print_status("Using '#{datastore['EXENAME']}' as payload...")

     file_size = File.size(payload_exe)
-    stream = Rex::Text.zlib_deflate(IO.read(payload_exe))
+    stream = Rex::Text.zlib_deflate(File.binread(payload_exe))
     md5 = Rex::Text.md5(File.read(payload_exe))

@gwillcox-r7 gwillcox-r7 Mar 15, 2022

This seems to be failing for me which leads me to ask, should this not also be File.binread in the call to Rex::Text.md5?
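
A sketch of the concern, assuming the goal is for the size, compressed stream and checksum to all describe the same bytes: read the file once in binary mode and derive everything from that. `payload_summary` is a hypothetical helper, and `Zlib::Deflate.deflate` and `Digest::MD5.hexdigest` are standard-library stand-ins, not the Rex::Text implementations.

```ruby
require 'digest'
require 'tempfile'
require 'zlib'

# Hypothetical helper: one binary read feeds the size, stream and checksum.
def payload_summary(path)
  data = File.binread(path)
  {
    size: data.bytesize,
    stream: Zlib::Deflate.deflate(data),
    md5: Digest::MD5.hexdigest(data)
  }
end

# Made-up bytes standing in for an executable.
SUMMARY_BYTES = "MZ\x90\x00\r\n\x1A\xFF payload".b

Tempfile.create('payload-summary') do |f|
  f.binmode
  f.write(SUMMARY_BYTES)
  f.flush

  summary = payload_summary(f.path)
  puts "size matches File.size? #{summary[:size] == File.size(f.path)}"
  puts "md5: #{summary[:md5]}"
end
```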

Good God, are they putting encoding on #read? I'm guessing there's a fair amount of that going on 😕

Probably. I wouldn't be surprised if similar mistakes were made just because people didn't know binread existed.

Git blame will not be kind to me on this one. Since we embed binary encoding enforcement at the Ruby file level, I thought this was already addressed. Should we change all occurrences then? The encoding nightmares of the 1.9 days are an old ghost; I'm hoping there's a consistent way to do this all around (and should we update the Rex underpinnings?).

@gwillcox-r7
Contributor

gwillcox-r7 commented Mar 15, 2022

This is still failing for me:

msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > reload
[*] Reloading module...
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > set EXENAME "C:\Windows\System32\notepad.exe"
EXENAME => C:\Windows\System32\notepad.exe
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > show options

Module options (exploit/windows/fileformat/adobe_pdf_embedded_exe):

   Name            Current Setting                                               Required  Description
   ----            ---------------                                               --------  -----------
   EXENAME         C:\Windows\System32\notepad.exe                               no        The Name of payload exe.
   FILENAME        evil.pdf                                                      no        The output filename.
   INFILENAME      C:/metasploit-framework/embedded/framework/data/exploits/CVE  yes       The Input PDF filename.
                   -2010-1240/template.pdf
   LAUNCH_MESSAGE  To view the encrypted content please tick the "Do not show t  no        The message to display in the File: area
                   his message again" box and press Open.


Payload options (windows/meterpreter/reverse_tcp):

   Name      Current Setting  Required  Description
   ----      ---------------  --------  -----------
   EXITFUNC  process          yes       Exit technique (Accepted: '', seh, thread, process, none)
   LHOST     172.22.217.140   yes       The listen address (an interface may be specified)
   LPORT     4444             yes       The listen port

   **DisablePayloadHandler: True   (no handler will be created!)**


Exploit target:

   Id  Name
   --  ----
   0   Adobe Reader v8.x, v9.x / Windows XP SP3 (English/Spanish) / Windows Vista/7 (English)


msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > run

[*] Reading in 'C:/metasploit-framework/embedded/framework/data/exploits/CVE-2010-1240/template.pdf'...
[*] Parsing 'C:/metasploit-framework/embedded/framework/data/exploits/CVE-2010-1240/template.pdf'...
[-] Sorry, I'm picky. Incompatible PDF structure, please try a different PDF template.
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) >

I think this may be related to the comment above about using File.read instead of File.binread on payload_exe. This also raises the bigger question of whether other modules may require similar adjustments. Will test these changes now and report back.

Edit: It's still failing for me, not sure what is going on here...

@sjanusz-r7 sjanusz-r7 force-pushed the replace-io-with-file branch 3 times, most recently from 06f3937 to 96aff11 Compare March 21, 2022 12:04
@adfoster-r7
Contributor

@gwillcox-r7 I believe that issue should be resolved now 🤞

For visibility: we expanded the scope of this PR a bit beyond just updating the IO.read calls. It now focuses on fixing the most obvious scenarios where uses of File.read and IO.read should have been using File.binread or specifying mode rb for reading binary files.

We've made the world a better place, and although there are more than likely still gremlins, it should be shippable in its current state. One potential gremlin is uploading user-specified files to a remote target: whether line endings should be normalised or not, and the impact that has on uploading binary files. Other tools solve this by letting the user specify ascii/binary modes, e.g. BSD's ftp client, or by always uploading in binary mode, e.g. Samba's smbclient.
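
A sketch of what an explicit transfer mode could look like, loosely modelled on the ascii/binary switch mentioned above. `read_for_upload` and its `mode:` option are hypothetical and nothing like this exists in the PR. Binary mode returns the bytes untouched, while ascii mode normalises every line ending to CRLF, which is only one possible normalisation policy.

```ruby
require 'tempfile'

# Hypothetical helper: read a local file for upload in :binary or :ascii mode.
def read_for_upload(path, mode: :binary)
  data = File.binread(path)
  case mode
  when :binary then data
  when :ascii  then data.gsub(/\r\n|\r|\n/, "\r\n") # normalise line endings
  else raise ArgumentError, "unknown transfer mode: #{mode.inspect}"
  end
end

# Made-up content mixing LF and CRLF line endings with a NUL byte.
UPLOAD_BYTES = "a\nb\r\nc\x00".b

Tempfile.create('upload-demo') do |f|
  f.binmode
  f.write(UPLOAD_BYTES)
  f.flush

  puts read_for_upload(f.path).inspect
  puts read_for_upload(f.path, mode: :ascii).inspect
end
```

Defaulting to binary keeps executables and other binary payloads byte-for-byte identical unless the user opts in to normalisation.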

@gwillcox-r7
Contributor

Looks like I made a mistake: to test the module in question, I also needed the changes made to modules/auxiliary/fileformat/badpdf.rb and lib/msf/core/exploit/pdf_parse.rb. With these changes included, the module works as expected:

msf6 > use exploit/windows/fileformat/adobe_pdf_embedded_exe
[*] No payload configured, defaulting to windows/meterpreter/reverse_tcp
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > set exename C:\Windows\System32\notepad.exe
exename => C:WindowsSystem32notepad.exe
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > set exename "C:\Windows\System32\notepad.exe"
exename => C:\Windows\System32\notepad.exe
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) > run

[*] Reading in 'C:/metasploit-framework/embedded/framework/data/exploits/CVE-2010-1240/template.pdf'...
[*] Parsing 'C:/metasploit-framework/embedded/framework/data/exploits/CVE-2010-1240/template.pdf'...
[*] Using 'C:\Windows\System32\notepad.exe' as payload...
[+] Parsing Successful. Creating 'evil.pdf' file...
[+] evil.pdf stored at C:/Users/normal/.msf4/local/evil.pdf
msf6 exploit(windows/fileformat/adobe_pdf_embedded_exe) >

Will do one last inspection of the code and then get this landed, thanks for making these changes!

@gwillcox-r7
Contributor

This should be good to land. Unfortunately GitHub had some hiccups earlier with their API and similar, so I wasn't sure if I should land it or not, but this should be good to go.

@gwillcox-r7 gwillcox-r7 added the rn-fix release notes fix label Mar 24, 2022
@gwillcox-r7 gwillcox-r7 merged commit bf88b7f into rapid7:master Mar 24, 2022
@gwillcox-r7
Contributor

gwillcox-r7 commented Mar 24, 2022

Release Notes

This PR replaces IO.read with File.binread, in scenarios where it's obvious that we're reading from binaries, to prevent an issue where not all of the file has been read correctly due to an additional EOL<->CRLF conversion that happens on Windows.


5 participants