Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSPI authentication #162

Closed
ekolb opened this issue Feb 11, 2014 · 24 comments
Closed

SSPI authentication #162

ekolb opened this issue Feb 11, 2014 · 24 comments
Labels

Comments

@ekolb
Copy link

ekolb commented Feb 11, 2014

SSPI Authentication does not work.

Steps:
I have set up a windows domain trying to use SSPI.
In the same server, it is no problem.
It is a problem when divided into AP server and DB server.

Step1.
AP Server and DB Server are set as the same domain.

Step2.
Windows firewall on DB Server setup.
It enables it to connect the port of PostgreSQL to DB Server from AP Server.

Step3.
The login roll of the same name as the login user of Windows is added to the login roll of PostgreSQL on DB Server.

Step4.
Method of pg_hba.conf of PostgreSQL of DB Server is changed into SSPI.
host all all 0.0.0.0/0 sspi
host all all ::1/0 sspi

Step5.
I run my .NET client program on AP server.

NpgsqlConnection conn = new NpgsqlConnection("Server=APserver;Port=5432;Database=postgres; Integrated Security= true;");
conn.Open();
NpgsqlCommand command = conn.CreateCommand();
command.CommandText = "select * from pg_locks";
command.ExecuteScalar();
conn.Close();

Result:
pg_log file
FATAL: could not accept SSPI security context
Detail:The token supplied to the function is invalid
(80090308)

When running client program on DB Server, it has connected using SSPI.
When using md5, id and password were set up and it has connected from AP Server.

possible solution:
SSPI connect does not work between npgsql client and postgres server in a domain.
Whereas it works with command line psql or a Qt application (using libpq).

Capturing network traffic reveals something interesting:

works psql

2014-01-31 06:40:16 PST DEBUG: Processing received SSPI token of length 40
2014-01-31 06:40:16 PST DEBUG: sending SSPI response token of length 264
2014-01-31 06:40:16 PST DEBUG: sending GSS token of length 264
2014-01-31 06:40:16 PST DEBUG: SSPI continue needed
2014-01-31 06:40:16 PST DEBUG: Processing received SSPI token of length 562
2014-01-31 06:40:16 PST LOG: connection authorized: user=mynewuser database=mdb

not work npgsql

2014-01-31 06:35:10 PST DEBUG: Processing received SSPI token of length 41
2014-01-31 06:35:10 PST DEBUG: sending SSPI response token of length 264
2014-01-31 06:35:10 PST DEBUG: sending GSS token of length 264
2014-01-31 06:35:10 PST DEBUG: SSPI continue needed
2014-01-31 06:35:10 PST DEBUG: Processing received SSPI token of length 563
2014-01-31 06:35:10 PST FATAL: could not accept SSPI security context
2014-01-31 06:35:10 PST DETAIL: The token supplied to the function is invalid

(80090308)

works qt

2014-01-31 06:47:18 PST DEBUG: Processing received SSPI token of length 40
2014-01-31 06:47:18 PST DEBUG: sending SSPI response token of length 264
2014-01-31 06:47:18 PST DEBUG: sending GSS token of length 264
2014-01-31 06:47:18 PST DEBUG: SSPI continue needed
2014-01-31 06:47:18 PST DEBUG: Processing received SSPI token of length 562

2014-01-31 06:47:18 PST LOG: connection authorized: user=mynewuser database=mdb

Looks like npgsql is sending an extra char.

I checked the code (2.0.14.3) and made two changes to remove the extra char:

NpgsqlPasswordPacket.cs: function WriteToStream:
case ProtocolVersion.Version3:
outputStream.WriteByte((Byte) 'p');
/// PGUtil.WriteInt32(outputStream, 4 + password.Length + 1);
PGUtil.WriteInt32(outputStream, 4 + password.Length);

PgUtil.cs: function writeBytes
network_stream.Write(the_bytes, 0, the_bytes.Length);
// comment this out
// network_stream.Write(new byte[1], 0, 1);


After that sspi worked.

@franciscojunior
Copy link
Member

Thanks @ekolb ! We will have a look at this. Your feedback is very good. It will help us very much to fix this problem.

@the21st
Copy link

the21st commented Feb 17, 2014

Hi! I am having the same issue and have resolved it by the modification suggested by @ekolb .

Thanks :)

@franciscojunior
Copy link
Member

Hi all!

This is now fixed in master branch. Please give it a try and let me know if it works ok for you.

@the21st
Copy link

the21st commented Feb 24, 2014

Hi @franciscojunior, I just tried compiling from master and it works, but it seems to be a lot slower than the .dll I built from 2.0.14.3 with the changes proposed here (Both are Release builds for .NET 4.0)

@franciscojunior
Copy link
Member

Hmmmm, I think this is caused by an LDAP query the new code does.
I'll check with @fluggo if we can make something about it.

Can you do a test for me, please? Can you tell what is the UserName being used? This can be obtained with this code:

new NpgsqlConnectionStringBuilder("Server=APserver;Port=5432;Database=postgres; Integrated Security= true;").UserName;

Thanks in advance.

@the21st
Copy link

the21st commented Feb 24, 2014

It's "simon.sotak", my company username.

@fluggo
Copy link

fluggo commented Feb 24, 2014

Yes, the LDAP lookup for the user name is likely to make new connections slow. I was waiting for a flash of inspiration on how to overcome that.

As luck would have it, I just thought of something. I could cache the UPN by SID. Let me whip up a few tests and see what I come up with.

@franciscojunior
Copy link
Member

@fluggo , I was thinking about the following scenario: @the21st doesn't seem to be using ldap server, or at least didn't need one to make SSPI connection work. So, I was thinking about another approach where we could skip the ldap query.
I'm asking that because if the slow down was caused by timeout trying to connect to an non-existent ldap server, even if you use a cache strategy, it will continue to be slow.

What do you think?

@franciscojunior
Copy link
Member

And by skip the ldap query, I mean we could make this query optional and only usable by those who really want/need to use it.

@fluggo
Copy link

fluggo commented Feb 24, 2014

Like I said before, I think he succeeded because his username is all lowercase. Had his username included capital letters, he probably would have needed the LDAP query.

The LDAP query is basically necessary because of a Windows issue of not hanging on to the user's UPN with its proper, Active Directory-sanctioned case. So in the most common scenario of using integrated security to connect from a Windows domain machine, this is the right thing to do, and it should always work. Disabling it by default would mean breaking integrated security for anyone with a capital letter in their username.

The code probably won't work in two scenarios I can think of: connecting from a non-domain Windows machine (because the machine probably won't understand an LDAP://rootDSE query, I might be able to address this), and connecting using Kerberos from a Linux machine (because the code assumes the availability of a WindowsIdentity and SID).

Short of a general solution that includes those two scenarios, the question comes down to who do you want to inconvenience more: those using Integrated Security from a non-domain or non-Windows machine, or Windows users with capital letters in their UPN?

@franciscojunior
Copy link
Member

Like I said before, I think he succeeded because his username is all lowercase. Had his username included capital letters, he probably would have needed the LDAP query.

I see. I don't have a test setup for integrated security so I didn't know the capital letter in the username would make such a difference.

Short of a general solution that includes those two scenarios, the question comes down to who do you want to inconvenience more: those using Integrated Security from a non-domain or non-Windows machine, or Windows users with capital letters in their UPN?

Good question. I think the most common usage of integrated security is from domain based windows users. So, I think users on non domain or non-windows machines would face a little delay when trying to use integrated security.
Would it be possible that we add another check for non windows machines? This way we could save the ldap query which wouldn't work anyway.

@franciscojunior
Copy link
Member

@the21st , can you merge the latest @fluggo patch #172 and check if the slowdown you are facing is diminished? Thanks in advance.

@the21st
Copy link

the21st commented Feb 24, 2014

As one would expect with a cache, the slowdown is present during the first request and eliminated in the subsequent queries. I assume the slowdown will emerge once every 30 seconds? Which, in the case I do a query once per minute, can be painful :)

@fluggo
Copy link

fluggo commented Feb 24, 2014

Hmm. Well, I can increase it. How long is it taking you to connect? Are we talking multiple seconds? You might actually be encountering an exception in the LDAP code, and I wish there was a good way to report it; can you ask your debugger to stop on all exceptions and see if there's an exception thrown there?

I'm sitting on the same network as my domain servers, so in my test I was doing 500 straight connection attempts, and I went from doing that in eight seconds to doing it in two.

@fluggo
Copy link

fluggo commented Feb 24, 2014

@franciscojunior I can see what it will take to get this working from a non-domain Windows machine. I hadn't planned to, but I don't want this to be a major issue.

@the21st
Copy link

the21st commented Feb 24, 2014

I tried stopping on all exceptions, but none were thrown.

The first connection takes about 1.5 secs. The thing is, I am using Npgsql in a stateless app that creates a new connection every time the user requests a save/load from a DB. Which can happen at any time interval. So that's a pretty specific use, I don't want you to accommodate the library to my needs.

I am using the .dll I compiled from the changes suggested by the issue creator and everything works just fine.

@fluggo
Copy link

fluggo commented Feb 24, 2014

Ahh. Well, I chose 30 seconds just because it was the smallest effective number I could think of, but honestly, the SID to UPN mapping is really, really unlikely to change. I could put the cache timeout at 24 hours and fix your use case. I think I'll do that.

@franciscojunior
Copy link
Member

@franciscojunior I can see what it will take to get this working from a non-domain Windows machine. I hadn't planned to, but I don't want this to be a major issue.

Ahh, ok. I thought that on a non windows machine, only the username got from WindowsIdentity.GetCurrent() would be enough.

But for while, I think the current strategy will work for most of the use cases. We can think about how to make it work on non-windows machines in the future.

@fluggo
Copy link

fluggo commented Feb 24, 2014

@franciscojunior Welp. Turns out that trying to get this to work (automagically, anyways) from a non-domain Windows machine is a whole other can of worms.

So let's start from the top.

  1. PostgreSQL expects a user name in its startup packet. It throws a fit if it's missing. (I just tried.) So we need to specify a user name.
  2. GSSAPI and SSPI specify an entire Kerberos ticket as their password, and that ticket includes the Kerberos user name.
  3. PostgreSQL expects these to match, and the match is case-sensitive.
  4. A programmer coming from the normal world of integrated security expects to just be able to set Integrated Security=true, and the rest will just work. Windows knows the user name, right?
  5. Windows is Windows and doesn't care about the user name's case, so it doesn't care about getting it right when it create the user's token. I've heard the case matches what the user typed at the login prompt, I don't know.

So there's the rationale for the LDAP query. We need to contact the domain and ask it what the user's official UPN is, case and all.

Now, if you're on a Windows machine that isn't on the domain, you can't actually log in as a domain user. You can't even really impersonate one. What you can do is attach new network credentials to your user token using the LOGON32_LOGON_NEW_CREDENTIALS logon type (same as using runas /netonly). We do this in our in-house apps to enable them to run on users' home machines against SQL Server. Works like a charm.

However, those credentials are not available to the running program. I cannot discover what your LOGON32_LOGON_NEW_CREDENTIALS are, because I am not actually running as that user. I can construct and send a Kerberos ticket to Postgres, and this will contain your Kerberos user name, BUT I will not be able to determine the right user name to send in the Postgres startup packet.

So really, the correct solution here is for Postgres to not require the user name in the startup packet when using SSPI or GSSAPI (or at least ignore it). It should discover the name from the Kerberos ticket we send. That will always be accurate, and it doesn't require us to do all this extra work. It would work just like integrated security on SQL Server.

That's the long, painful story. But there is a workaround: if you're on a non-domain machine, specify the user name. Otherwise, doing the right thing requires a server-side change.

@fluggo
Copy link

fluggo commented Feb 24, 2014

Filed this as bug 9337 with Postgres.

@franciscojunior
Copy link
Member

That's the long, painful story. But there is a workaround: if you're on a non-domain machine, specify the user name. Otherwise, doing the right thing requires a server-side change.

Wow! Thanks for the thoroughly explanation of the situation, Brian! I didn't know anything about that.

I'll add your explanation to the documentation of integrated security. But I think integrated security on domain-windows machines will cover the majority of the use cases.

@franciscojunior
Copy link
Member

@ekolb , can you check if current code fixes the issue for you too, please? This way we can close this issue. Thanks in advance.

@franciscojunior
Copy link
Member

@ekolb , if you still find problems with SSPI connections, please let us know.
#170 fixed this problem. Closing now.

@ekolb
Copy link
Author

ekolb commented Aug 25, 2014

Thanks.
I am OK with closing.

Emanuel

From: Francisco Figueiredo Jr. [mailto:notifications@github.com]
Sent: Sonntag, 17. August 2014 21:06
To: npgsql/Npgsql
Cc: Emanuel Kolb
Subject: Re: [Npgsql] SSPI authentication (#162)

@ekolbhttps://github.com/ekolb , if you still find problems with SSPI connections, please let us know.
#170#170 fixed this problem. Closing now.


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-52431427.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants