SMBv1 stack doesn't talk unicode on the wire #51

Closed
GoogleCodeExporter opened this issue Apr 15, 2015 · 15 comments
Labels
enhancement Implemented features can be improved or revised

Comments

@GoogleCodeExporter

Hi mate,

sorry, another encoding / decoding issue... again... :)

This one can be triggered on Windows systems prior to Vista only (2000, XP, 
2003, etc.) because it only affects the old SMBv1 stack implemented in smb.py 
(the issue doesn't affect smb3.py). To trigger the bug on recent systems, you 
have to force the SMB Dialect to SMB_DIALECT.

I found this bug while browsing a remote filesystem with filenames created 
using a non "ASCII-derived" code page like cp866 for the Cyrillic charset (don't 
ask any questions concerning this remote filesystem :]). 

By the way, the default code page on the remote server wasn't cp866 (this file 
probably came from a previous windows migration or something like that).

How to reproduce the bug?:
+++++++++++++++++++++++++

1. create a directory on the remote server using code page 866:

C:> chcp 866
C:> mkdir Ь

Ь = \x9c in cp866 and \u042C in unicode

2. connect to the remote server using smbclient.py

$> python smbclient.py 
Impacket v0.9.13-dev - Copyright 2002-2014 Core Security Technologies

Type help for list of commands
# open 1.2.3.4
[*] SMBv1 dialect used
# login DOMAIN/Administrator
Password:
[*] USER Session Granted
# use C$
# ls
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 AUTOEXEC.BAT
-rw-rw-rw-        211  Thu Jun 21 14:57:29 2012 boot.ini
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 CONFIG.SYS
drw-rw-rw-          0  Mon Nov 18 11:28:12 2013 Documents and Settings
drw-rw-rw-          0  Tue Sep  2 18:06:04 2014 Ê? <----------------------- here is the weird directory
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 IO.SYS
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 MSDOS.SYS
drw-rw-rw-          0  Tue Sep 24 13:38:20 2013 MSOCache
-rw-rw-rw-      47564  Mon Apr 14 13:59:59 2008 NTDETECT.COM
-rw-rw-rw-     250048  Mon Apr 14 13:59:59 2008 ntldr
-rw-rw-rw- 1610612736  Mon Nov 18 11:34:04 2013 pagefile.sys
drw-rw-rw-          0  Wed Sep 25 00:13:50 2013 Program Files
drw-rw-rw-          0  Tue Sep 24 22:17:15 2013 RECYCLER
drw-rw-rw-          0  Thu Jun 21 15:20:23 2012 System Volume Information
drw-rw-rw-          0  Tue Oct 29 15:10:19 2013 temp
drw-rw-rw-          0  Wed Sep  3 18:50:13 2014 WINDOWS

3. try to browse the directory

# cd Ь
[!] SMB SessionError: STATUS_OBJECT_NAME_NOT_FOUND(The object name is not found.
# cd Ê?
[!] SMB SessionError: STATUS_OBJECT_NAME_INVALID(The object name is invalid.)

As you can see, you can't browse the directory because your SMB stack doesn't 
properly encode the filename before sending it on the wire.


Where is the issue located?:
++++++++++++++++++++++++++++

After some investigation, it seems that your SMBv1 stack talks to the remote 
server using non-Unicode strings, unlike smb3.py (technically, the flag 
SMB.FLAGS2_UNICODE is not set in the "Flags2" parameter). Consequently, when 
your stack receives or sends a packet, all strings (including filenames) are 
encoded using a default encoding such as cp1252. 

So, if the original filename's encoding isn't an "ASCII-derived" one, your 
stack won't properly convert the directory's name before sending it on the wire 
and, in the end, the server won't be able to find the required directory.

In the previous case, creating a directory called "Ь" using cp866 produces 
this binary string on the remote server: \x9c. 

But when you list the directory containing this file, the server sends 
the following binary string on the wire: \xca\x3f\x00 (trouble begins...). 

Then, when you try to browse the directory using the SMBv1 stack, the 
following binary string is sent on the wire: \x5c\xd0\xac\x00 (it becomes 
anything and everything except the right name...). From the remote server's 
point of view, this filename doesn't mean anything, so it returns an error.

This issue doesn't affect smb3.py because this stack talks unicode with the 
remote server and even if your filename is fucked up, they will understand each 
other.


How to fix it:
++++++++++++++

There is no easy way to fix the issue, but I think that standardizing the SMBv1 and 
SMBv2 stacks would be the best solution. By "standardize", I mean making sure 
that both of your APIs talk Unicode on the wire like smb3.py already does. When 
talking Unicode, opening a file whose name was encoded with a pretty weird code page 
won't break your SMB stack.
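
As a rough illustration of what "talking Unicode" changes, here is a minimal Python 3 sketch. The helper encode_smb_string() and the cp437 default are assumptions made for the example and are not part of impacket; only the 0x8000 value of the Unicode bit comes from [MS-CIFS]. Alignment padding of Unicode strings is ignored.

```python
# Flags2 bit defined by [MS-CIFS]; the helper name below is hypothetical.
SMB_FLAGS2_UNICODE = 0x8000


def encode_smb_string(text, flags2, oem_codec="cp437"):
    """Encode a null-terminated string according to the Flags2 Unicode bit."""
    if flags2 & SMB_FLAGS2_UNICODE:
        # Unicode on the wire: UTF-16LE with a two-byte terminator.
        return text.encode("utf-16-le") + b"\x00\x00"
    # OEM on the wire: a single-byte code page; unmappable characters are lost.
    return text.encode(oem_codec, errors="replace") + b"\x00"


# "\<soft sign>" sent with and without the Unicode bit.
unicode_path = encode_smb_string("\\\u042c", SMB_FLAGS2_UNICODE)
oem_path = encode_smb_string("\\\u042c", 0)

print(unicode_path)  # b'\\\x00,\x04\x00\x00'
print(oem_path)      # b'\\?\x00'
```

With the Unicode bit set the character survives as the UTF-16LE code unit 2c 04, while the OEM path degrades it to a question mark.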


The patch:
+++++++++

Here is my patch to make it work. It prevents encoding / decoding problem in 
your SMBv1 stack even if filenames use weird and historical code pages.

After several tests, it doesn't break your examples. I just changed smbclient.py 
a little to convert input paths to unicode before sending them to your SMB stack:

$> python smbclient.py 
Impacket v0.9.13-dev - Copyright 2002-2014 Core Security Technologies

Type help for list of commands
# open 1.2.3.4
[*] SMBv1 dialect used
# login DOMAIN/Administrator
Password:
[*] USER Session Granted
# use C$
# ls
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 AUTOEXEC.BAT
-rw-rw-rw-        211  Thu Jun 21 14:57:29 2012 boot.ini
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 CONFIG.SYS
drw-rw-rw-          0  Mon Nov 18 11:28:12 2013 Documents and Settings
drw-rw-rw-          0  Tue Sep  2 18:06:04 2014 ╨м <----------------------- here is the weird directory
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 IO.SYS
-rw-rw-rw-          0  Thu Jun 21 15:01:57 2012 MSDOS.SYS
drw-rw-rw-          0  Tue Sep 24 13:38:20 2013 MSOCache
-rw-rw-rw-      47564  Mon Apr 14 13:59:59 2008 NTDETECT.COM
-rw-rw-rw-     250048  Mon Apr 14 13:59:59 2008 ntldr
-rw-rw-rw- 1610612736  Mon Nov 18 11:34:04 2013 pagefile.sys
drw-rw-rw-          0  Wed Sep 25 00:13:50 2013 Program Files
drw-rw-rw-          0  Tue Sep 24 22:17:15 2013 RECYCLER
drw-rw-rw-          0  Thu Jun 21 15:20:23 2012 System Volume Information
drw-rw-rw-          0  Tue Oct 29 15:10:19 2013 temp
drw-rw-rw-          0  Thu Sep  4 11:42:42 2014 WINDOWS
# cd ╨м
# ls
drw-rw-rw-          0  Thu Sep  4 11:43:28 2014 .
drw-rw-rw-          0  Thu Sep  4 11:43:28 2014 ..

As you can see, the display is not very pretty because my remote server doesn't use 
the cp866 code page as its default. So, when it tries to convert the filename to 
unicode in order to send it on the wire, it fails and sends an invalid unicode 
string. Technically, it tries to convert a filename created using cp866 to 
unicode using its default code page (I think it's cp1252), and that obviously 
doesn't work.

But with this patch, it doesn't matter because client and server understand 
each other. When I try to browse the cyrillic directory, I send the invalid 
unicode filename that the remote server sent me before when I listed the 
parent directory. The remote server will land on its feet because it will try 
to decode the invalid unicode filename using its default code page and will 
finally come back to the cp866 filename.
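
The client side of that round trip can be sketched in a few lines of Python 3. This is only an illustration under the assumption that the wire encoding is UTF-16LE; it says nothing about how the server maps the name back internally.

```python
# Name as displayed by the patched client in the listing above
# (U+2568 followed by U+043C).
listed_name = "\u2568\u043c"

# Unicode transport: the client can echo back exactly what it received.
wire = listed_name.encode("utf-16-le")
unicode_round_trip = wire.decode("utf-16-le") == listed_name

# Single-byte transport with a Western code page: both characters are
# unmappable and collapse into question marks.
lossy = listed_name.encode("cp1252", errors="replace")

print(unicode_round_trip)  # True
print(lossy)               # b'??'
```

Whatever the name looks like on screen, the client only needs to return the same code units it was given.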

However, this patch will probably break SMB connections with pre-2000 systems 
(I think that they don't talk unicode on the wire...). Letting the developer choose 
the encoding to use on the wire may be a solution.

Anyway, tell me what you think about this patch. If you have any questions or 
improvements, or if anything in my explanation is unclear, please 
let me know ;)

Original issue reported on code.google.com by renaud.d...@synacktiv.com on 4 Sep 2014 at 11:09

Attachments:

@GoogleCodeExporter
Author

Hola mate!

Sorry for the delayed response.. you rock!.. Let me dig into your mail and I'll get 
back to you...Big thanks for taking a look at it (SMBv1 code is ugly).. and 
yes.. it does not support Unicode connections. Time to attack this issue :)

cheers!
beto

Original comment by bet...@gmail.com on 8 Sep 2014 at 1:36

  • Changed state: Accepted

@GoogleCodeExporter
Author

Hmm.. still trying to repro this issue based on your repro steps.

I've used a Windows 7 as the target.

1. chcp 866
2. mkdir b

then I forced smbclient.py to connect using smbv1 (change smbconnection.py and 
force the SMBConnection.__init__() preferredDialect to SMB_DIALECT).

I connected to the target system.. smbclient.py is telling me I'm with smbv1 
and I see the directory as 'b'.

Mate.. did you take the same steps as myself?.. I might be missing something.


Original comment by bet...@gmail.com on 16 Sep 2014 at 7:27

@GoogleCodeExporter
Author

Be careful, in the "mkdir" command, it's not a "b" but a "Ь" (the cyrillic 
character). In your case, it still works because "b" is inherently and 
correctly converted to the ascii char code \x62 (cp866 and ascii table are the 
same for the ascii charset).

To reproduce the bug you have to create a directory or a file containing 
a non-ascii char. The simplest way is to create the char with Python and copy / 
paste it into your Windows shell:

>>> print u"\u042c"
Ь

Tell me if it fixes your problem ;)
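
For what it's worth, the difference between the two characters can be checked like this (Python 3 sketch, standard codecs only):

```python
# ASCII characters encode to the same byte in cp866 and in ASCII itself.
ascii_same = "b".encode("cp866") == "b".encode("ascii") == b"\x62"

# The Cyrillic soft sign has no ASCII encoding, so the code page matters.
try:
    "\u042c".encode("ascii")
    cyrillic_is_ascii = True
except UnicodeEncodeError:
    cyrillic_is_ascii = False

print(ascii_same)         # True
print(cyrillic_is_ascii)  # False
```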

Original comment by renaud.d...@synacktiv.com on 17 Sep 2014 at 6:10

@GoogleCodeExporter
Author

dumb me.. 

thanks for the clarification.. just managed to create that directory..

First problem I found is smbclient.py dumps an exception, even when running SMB 
v2.1 against the target.. And that has to do with the smbclient.py itself.  
Doesn't that happen to you? .. I'm running smbclient.py from OSx


Original comment by bet...@gmail.com on 17 Sep 2014 at 5:51

@GoogleCodeExporter
Author

Ok.. just saw your precmd addition into smbclient.py to prevent this problem 
happening with SMB >= v2. I'm first committing that change.

For the rest, I want to carefully look at your changes since it might break 
many things .. It's gonna take some time.

thanks again!
beto

Original comment by bet...@gmail.com on 17 Sep 2014 at 6:28

@GoogleCodeExporter
Author

It's ok, I probably missed something during my patching madness :) so, I think 
it's a good thing to carefully look at my changes. I don't have an overall view 
of the project and, yeah, it probably breaks something somewhere...

Anyway, if you need some help or if you have any question about my patch, 
please let me know ;)

Thanks to you for taking the time!

Renaud.

Original comment by renaud.d...@synacktiv.com on 18 Sep 2014 at 6:38

@GoogleCodeExporter
Author

Nooo.. thank you for taking the time to provide a patch (and especially for messing 
with smb.py..)

FYI, as far as I read, SMB_COM_TREE_CONNECT shouldn't encode unicode, from 
[MS-CIFS], 2.2.4.50.1:

Flags2 (2 bytes): The SMB_FLAGS2_UNICODE flag bit SHOULD be zero. Servers MUST 
ignore the SMB_FLAGS2_UNICODE flag and interpret strings in this request as 
OEM_STRING strings.<74>

Path (variable): A null-terminated string that represents the server and share 
name of the resource to which the client is attempting to connect. This field 
MUST be encoded using Universal Naming Convention (UNC) syntax. The string MUST 
be a null-terminated array of OEM characters, even if the client and server 
have negotiated to use Unicode strings.

This doesn't apply to SMB_COM_TREE_CONNECT_ANDX.

Let me know if you read something different.
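
In code terms, the rule quoted above might look like the following Python 3 sketch. The function name uses_unicode_strings() is made up for illustration and is not an impacket function; the command codes and the flag value are the ones listed in [MS-CIFS].

```python
# Command codes and the Flags2 bit as defined in [MS-CIFS].
SMB_COM_TREE_CONNECT = 0x70
SMB_COM_TREE_CONNECT_ANDX = 0x75
SMB_FLAGS2_UNICODE = 0x8000


def uses_unicode_strings(command, flags2):
    """Return True if strings in this request should be sent as Unicode."""
    if command == SMB_COM_TREE_CONNECT:
        # Always OEM strings, whatever was negotiated.
        return False
    return bool(flags2 & SMB_FLAGS2_UNICODE)


print(uses_unicode_strings(SMB_COM_TREE_CONNECT, SMB_FLAGS2_UNICODE))       # False
print(uses_unicode_strings(SMB_COM_TREE_CONNECT_ANDX, SMB_FLAGS2_UNICODE))  # True
print(uses_unicode_strings(SMB_COM_TREE_CONNECT_ANDX, 0))                   # False
```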

thanks again!
beto

Original comment by bet...@gmail.com on 18 Sep 2014 at 12:21

@GoogleCodeExporter
Author

Oh yes you're right. All my fault :) when I patched smb.py, I read the 
documentation about SMB_COM_TREE_CONNECT_ANDX only and not 
SMB_COM_TREE_CONNECT... In my head, both could use Unicode... sorry about that 
and thanks for the feedback.

Renaud.

Original comment by renaud.d...@synacktiv.com on 19 Sep 2014 at 9:10

@asolino added the enhancement label (Implemented features can be improved or revised) and removed the auto-migrated label on May 15, 2015
@asolino
Collaborator

asolino commented Jun 15, 2015

I took a look at @rdubourguais's Impacket repo (https://github.com/rdubourguais/impacket). I see him using the AsciiOrUnicodeStructure base class when defining SMB structures and I think that is the right approach for the problem.

I'd change, however, the talkUnicode() approach so we can make it more general. Also, the current implementation does not work for the protocol negotiation packet, which could cause some cases to fail.

In ed1d479 I added an option to detach the SMB_COM_NEGOTIATE from the SMBConnection constructor so the caller will need to explicitly call SMBConnection.negotiateSession() passing the flags and dialects to be supported.

Given that functionality and the existence of the currently unused SMB.get_flags() and SMB.set_flags(), I propose the following.

  1. Keep the current behaviour of the library as it is right now (no Unicode support by default for SMB1).
  2. If Unicode support is needed, initiate a connection this way:
     smbConnection = SMBConnection(remoteName, remoteHost, manualNegotiate = True)
     smbConnection.negotiateSession(flags1 = SMB.FLAGS1_PATHCASELESS, flags2 = SMB.FLAGS2_EXTENDED_SECURITY | SMB.FLAGS2_NT_STATUS | SMB.FLAGS2_UNICODE)
     smbConnection.login(username, password)
  3. SMBConnection.negotiateSession() should save the flags1/flags2 for the rest of the operations performed in that class instance. I suggest using SMB.get_flags() and SMB.set_flags() and porting all the SMB calls to use self.__flags1 and self.__flags2 when creating a NewSMBPacket().
  4. Then, for every SMB structure that needs to be built depending on the Unicode/ASCII configuration (AsciiOrUnicodeStructure), we can call it setting the flags parameter to self.__flags2.
  5. Add a test suite that covers all the https://github.com/CoreSecurity/impacket/blob/master/impacket/testcases/SMB_RPC/test_smb.py cases both for Unicode and ASCII.
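
To make points 3 and 4 a bit more concrete, here is a toy Python 3 sketch of the idea. The Session class and its methods are hypothetical stand-ins and do not reproduce impacket's SMB class or its packet structures; the flag values are the ones from [MS-CIFS].

```python
# Flag values from [MS-CIFS]; the class below is a hypothetical stand-in,
# not impacket's SMB class.
FLAGS1_PATHCASELESS = 0x08
FLAGS2_EXTENDED_SECURITY = 0x0800
FLAGS2_NT_STATUS = 0x4000
FLAGS2_UNICODE = 0x8000


class Session:
    def __init__(self):
        self._flags1 = 0
        self._flags2 = 0

    def negotiate(self, flags1, flags2):
        # Remember the negotiated flags for every later packet.
        self._flags1, self._flags2 = flags1, flags2

    def new_packet(self):
        # Each packet header is stamped with the saved flags.
        return {"Flags1": self._flags1, "Flags2": self._flags2}

    def encode(self, text):
        # Structures pick their string encoding from the saved Flags2.
        codec = "utf-16-le" if self._flags2 & FLAGS2_UNICODE else "ascii"
        return text.encode(codec)


session = Session()
session.negotiate(FLAGS1_PATHCASELESS,
                  FLAGS2_EXTENDED_SECURITY | FLAGS2_NT_STATUS | FLAGS2_UNICODE)
packet = session.new_packet()
encoded = session.encode("C$")

print(hex(packet["Flags2"]))  # 0xc800
print(encoded)                # b'C\x00$\x00'
```

The point of the design is that the Unicode decision is made once, at negotiation time, and every later packet and structure reads it from the same place.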

I think all the pieces are there, we just need to put them together.
I can do point 5), the most boring one :P.

what say you?

@rdubourguais
Contributor

Your approach is indeed more generic and sounds good to me. I can merge your approach with mine to implement points 3) and 4) in my impacket repo and then create a PR. I will let you implement point 5) if you insist :)

@asolino
Collaborator

asolino commented Jun 15, 2015

@rdubourguais good to know we're on the same page. That sounds like a plan.
Be careful with self.__flags1 and self.__flags2 because sometimes (e.g. here https://github.com/CoreSecurity/impacket/blob/master/impacket/smb.py#L3139) these variables are overwritten inside the lib. I would change such occurrences to:

self.__flags1 |= SMB.FLAGS1_PATHCASELESS

or similar.
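
A quick Python 3 illustration of why the OR matters, using plain integers instead of the library attributes (flag values from [MS-CIFS], variable names made up for the example):

```python
FLAGS2_NT_STATUS = 0x4000
FLAGS2_UNICODE = 0x8000

# Flags as saved after negotiation.
negotiated = FLAGS2_NT_STATUS | FLAGS2_UNICODE

# Overwriting discards whatever was negotiated earlier.
overwritten = negotiated
overwritten = FLAGS2_NT_STATUS

# OR-ing sets a bit and keeps the rest.
merged = negotiated
merged |= FLAGS2_NT_STATUS

print(bool(overwritten & FLAGS2_UNICODE))  # False
print(bool(merged & FLAGS2_UNICODE))       # True
```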

I'll do 5) ;)

asolino added a commit that referenced this issue Jun 15, 2015
Feature still not there but the idea is to implement it using manualNegotiate
and setting the appropiate flags using SMBConnection.negotiateSession().
Check #51 for details.
@asolino
Collaborator

asolino commented Jun 15, 2015

  5) done in 8a1239f

@asolino
Collaborator

asolino commented Jun 16, 2015

@rdubourguais just merged #68. I also added some minor additions based on errors thrown by the test cases.

Let me know if this solves the original intention for Unicode support. If so, you close this issue.

Thanks for your help! Looking forward to more additions ;)

@rdubourguais
Contributor

It perfectly solves my original issue ;) But it seems I don't have the right permission to close the issue :)

Thank you for the great job you have done on this lib. It makes my daily job much easier :]

@asolino
Collaborator

asolino commented Jun 17, 2015

Great to know mate.. let's think on the next feature ;).. Ideas are welcomed...

thanks to you!.. closing this one.
