Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres returns inserted data in exception if insert fails #2501

Open
williamdenton opened this issue Jun 19, 2019 · 15 comments

Comments

@williamdenton
Copy link
Contributor

commented Jun 19, 2019

Postgres returns row data in the Detail section of its errors which is in turn captured in the PostgresException Data dictionary using the key Detail key. At first glance this makes sense but it is also leaking database data into exceptions which often get logged.

Detail = GetValue<string>(nameof(Detail));

For the relatively common case of violating a not-null constraint this is the data dictionary that is serialised into logs:

...
"Data": {
    "Severity": "ERROR",
    "SqlState": "23502",
    "Code": "23502",
    "MessageText": "null value in column \"email\" violates not-null constraint",
    "Detail": "Failing row contains (123, william, denton, null, 2019-06-19 01:55:53.6688).",
    "SchemaName": "service",
    "TableName": "person",
    "ColumnName": "email",
    "File": "execMain.c",
    "Line": "2008",
    "Routine": "ExecConstraints"
},
"ClassName": "Npgsql.PostgresException",
"Message": "23502: null value in column \"email\" violates not-null constraint",
"StackTrace": ...
...

In this case it's just some PII (Personally identifiable information) of the user I inserted into my person table. But this could be much, much more serious; passwords, social security numbers, even credit card numbers and other secret tokens. Inserted data lives in the DB not in a logging system.

Proposal

I realise this is an issue with postgres itself and npgsql is just passing the message on, but I propose Npgsql should suppress the Detail message in exceptions, and instead provide an opt-in way of populating potentially unsafe properties if the user desires them.
This could be viewed as a breaking change, but leaving the detail message there is not safe by default and many people will be unwittingly logging things they didn't realise.

@YohDeadfall

This comment has been minimized.

Copy link
Member

commented Jun 19, 2019

I understand your concerns, but as you said Npgsql is a driver and it isn't its responsibility. At the same time other systems could spoil secure data. So it would be better to solve such things at logging level by:

  • limiting access to the logs and having audit enabled;
  • writing an interceptor which will erase sensitive data from messages before writing them.

The first way is harder, but it provides best security and prevents any leaks. Even if they happen, you would be able to determine a person which hurt the business.

The last way and what you proposed aren't a panacea. It will just give a false safety feeling without having the first.

@YohDeadfall

This comment has been minimized.

Copy link
Member

commented Jun 19, 2019

By the way, it would be better to raise the issue in the PostgreSQL hacker list.

@roji

This comment has been minimized.

Copy link
Member

commented Jun 19, 2019

Here's my point of view... As @YohDeadfall and @williamdenton wrote, it's true that Npgsql is just a driver passing along messages to and from PostgreSQL. Our policy in general is to be as pass-through as possible, exposing what PostgreSQL exposes.

However, I do agree there's an issue here, and wasn't aware that actual data was being exposed via the Detail field. The risk is made worse by the fact that we have the Data dictionary property, which can sometimes be accessed in GUIs by generic exception display components (i.e. with a property grid). Even if PostgreSQL were to add a config flag on their side for disabling transmitting this information, it still seems to make sense for Npgsql to have a "development" mode where data is exposed, vs. a "production" mode where it isn't.

Note that we already have IsParameterLoggingEnabled, which controls whether parameter contents are logged (disabled by default). This isn't exactly the same (e.g. logging vs. exception), but is also somewhat similar (potentially sensitive data that could be useful in development).

So I think it's a good idea to hide the detail information unless an explicit opt-in is used. If the user hasn't opted in, the default Detail message could be something to help users understand how to opt in (e.g. <REDACTED>: please set PostgresException.IncludeDetailMessage to true for potentially sensitive user information). I think we should also consider backporting this to previous branches, as this is about security (it's hard to really consider this a breaking change as we'd simply be changing the Detail message text).

Finally, as @YohDeadfall mentioned it's a good idea to raise this in the PG hacker list (please post a thread link here when you do).

Any objections/comments? Would you be OK with this change @YohDeadfall?

@YohDeadfall

This comment has been minimized.

Copy link
Member

commented Jun 19, 2019

I would like to hear PostgreSQL developers first because it could be a larger issue. Don't forget about other data providers written for other platforms, they could be affected too. Otherwise, I'm fine with the development mode.

@williamdenton

This comment has been minimized.

Copy link
Contributor Author

commented Jun 20, 2019

I have submitted a question to pgsql-general as the instructions are quite clear that you must have asked elsewhere before pgsql-hackers is used.

Heres the example i used

postgres=# create database test;
CREATE DATABASE
postgres=# \c test;
psql (11.1 (Debian 11.1-3.pgdg90+1), server 11.2)
You are now connected to database "test" as user "postgres".
test=# create table person (firstname text not null, lastname text not
null, email text not null);
CREATE TABLE
test=# insert into person values ('william', 'denton', null);
ERROR:  null value in column "email" violates not-null constraint
DETAIL:  Failing row contains (william, denton, null).
test=# insert into person values ('william', 'denton', 'email(at)example(dot)com');
INSERT 0 1
test=# update person set email = null;
ERROR:  null value in column "email" violates not-null constraint
DETAIL:  Failing row contains (william, denton, null).
test=#
@williamdenton

This comment has been minimized.

Copy link
Contributor Author

commented Jun 23, 2019

Takeaways from the mailing list thread:

log_error_verbosity Database parameter

There is a db level param that controls wether the detail field is returned, this is quite a big hammer as sometimes there is useful data in there, but it solves this problem. But the default database setting is to return the row data.

https://www.postgresql.org/docs/11/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

Controls the amount of detail written in the server log for each message that is logged. Valid values are TERSE, DEFAULT, and VERBOSE, each adding more fields to displayed messages. TERSE excludes the logging of DETAIL, HINT, QUERY, and CONTEXT error information. VERBOSE output includes the SQLSTATE error code (see also Appendix A) and the source code file name, function name, and line number that generated the error. Only superusers can change this setting.

From the discussion on the mailing list isn't going to result in a change in behaviour on the postgres side. The DB has many use-cases to which they cater. This particular concern isn't going to change the overall approach from the postgres maintainers.

I'm happy to make a PR to redact this information if you see that as a valid way forward for Npgsql.

@roji, were you thinking of a static bool property to implement this?

If the user hasn't opted in, the default Detail message could be something to help users understand how to opt in (e.g. : please set PostgresException.IncludeDetailMessage to true for potentially sensitive user information).


My current mitigation for this (as we really don't want row data in our logs and only just learnt of the db setting):
Extending our tracing configuration which leans on the miniprofiler core classes to nuke the property on the PostgresException before it bubbles up to client code.

https://github.com/MiniProfiler/dotnet/blob/25691149fb424e486cd273e72d31603266278a6c/src/MiniProfiler.Shared/Data/ProfiledDbCommand.cs#L219-L228

https://github.com/MiniProfiler/dotnet/blob/25691149fb424e486cd273e72d31603266278a6c/src/MiniProfiler.Shared/Data/IDbProfiler.cs#L38-L44

@YohDeadfall

This comment has been minimized.

Copy link
Member

commented Jun 24, 2019

Static properties are evil and affect all users in an application. So I think we should add another parameter to the connection string, and it should be generic to handle security issues (parameter logging, details and other cases), maybe an enum for extensibility.

@williamdenton

This comment has been minimized.

Copy link
Contributor Author

commented Jun 24, 2019

I was curious about how @roji envisioned the code working from his suggested error message. Upon more reflection it could be an instance property or function on the exception that moves the detail message from a private var on the postgres exception into the publicly accessible data dictionary and properties on the exception?

Altering the connection string for this seems very invasive and broad reaching, unless we can see a broader category of things we want to control with the same switch

@YohDeadfall YohDeadfall added enhancement and removed discussion labels Jun 24, 2019

@YohDeadfall YohDeadfall added this to the 4.1 milestone Jun 24, 2019

@YohDeadfall

This comment has been minimized.

Copy link
Member

commented Jun 24, 2019

Upon more reflection it could be an instance property or function on the exception that moves the detail message from a private var on the postgres exception into the publicly accessible data dictionary and properties on the exception?

Then an user will be forced to write interceptors to log details. So it would be better to have it as an option which can be turned on at any time without additional code and redeployment.

@roji

This comment has been minimized.

Copy link
Member

commented Jun 24, 2019

Static properties are evil and affect all users in an application. So I think we should add another parameter to the connection string, and it should be generic to handle security issues (parameter logging, details and other cases), maybe an enum for extensibility.

I actually don't think this merits a connection string parameter... It's very hard for me to imagine an application where you'd want this in one connection but not in another within the same application - typically this would be on in development, and off in production. Adding more and more parameters to the connection string makes it longer and more unwieldy in general.

So it would be better to have it as an option which can be turned on at any time without additional code and redeployment.

Can you explain in which scenarios you think this is necessary, justifying a connection string parameter? Again, I see this as something that would be always off in production (the default), and developers would only turn it on in development/debugging.

Note that we already have a static NpgsqLogManager.IsParameterLoggingEnabled which manages a similar thing, AFAIK nobody's ever requested to have it on on certain connections but not on others.

I was curious about how @roji envisioned the code working from his suggested error message.

I was simply imagining a static property. If false, the detail message isn't set on the exception, which instead has the fallback message explaining to the user what to do.

@YohDeadfall

This comment has been minimized.

Copy link
Member

commented Jun 24, 2019

It's very hard for me to imagine an application where you'd want this in one connection but not in another within the same application - typically this would be on in development, and off in production.

In out other repo you were asked about certificate callback configuration because Npgsql resides in Hangfire and you know nothing about the used provider. This can be handled via the connection string.

Adding more and more parameters to the connection string makes it longer and more unwieldy in general.

Only if it's a develop config or if an user wants to opt-in this in production environment. So it won't affect users so much.

Note that we already have a static NpgsqLogManager.IsParameterLoggingEnabled which manages a similar thing, AFAIK nobody's ever requested to have it on on certain connections but not on others.

Already covered that scenario in:

So I think we should add another parameter to the connection string, and it should be generic to handle security issues (parameter logging, details and other cases), maybe an enum for extensibility.

@roji

This comment has been minimized.

Copy link
Member

commented Jun 24, 2019

It's very hard for me to imagine an application where you'd want this in one connection but not in another within the same application - typically this would be on in development, and off in production.

In out other repo you were asked about certificate callback configuration because Npgsql resides in Hangfire and you know nothing about the used provider. This can be handled via the connection string.

I think certificate callbacks are very different from the decision to log user-sensitive data. It's very logical for the former to change within the same application (i.e. connecting to multiple databases), but the same doesn't seem true for the latter. What kind of scenario do you have in mind where a user would want sensitive info for one connection but not for another?

(unfortunately for the certificate callbacks we can't fully do it via the connection string, since it's callbacks and not a setting - but we do have #2129).

@YohDeadfall

This comment has been minimized.

Copy link
Member

commented Jun 24, 2019

What kind of scenario do you have in mind where a user would want sensitive info for one connection but not for another?

When one connection doesn't access secure information, but details would help you understand what's going on. Probably it's a rare case in the era of micro services, but there are a lot of services with plugins and monoliths.

Another vital case is debugging in production when you need to understand what's going on without changes in the code and without exposing other critical info. It's not a good thing, but shit happens.

In other words, I would like to see this feature to be enabled outside of an application with minimal user efforts.

AFAIK nobody's ever requested to have it on on certain connections but not on others

This doesn't mean that we shouldn't seek for an ideal API (:

@roji

This comment has been minimized.

Copy link
Member

commented Jun 26, 2019

Another vital case is debugging in production when you need to understand what's going on without changes in the code and without exposing other critical info. It's not a good thing, but shit happens.
In other words, I would like to see this feature to be enabled outside of an application with minimal user efforts.

I admit that's a pretty convincing argument. It would also be nice to manage parameter logging in the same way.

We could have a single enum parameter to manage this, although the two settings are pretty distinct - one is about logging (parameter values), the other is about parsing the server detail message. Maybe two parameters?

@austindrenski

This comment has been minimized.

Copy link
Member

commented Jul 1, 2019

We could have a single enum parameter to manage this, although the two settings are pretty distinct - one is about logging (parameter values), the other is about parsing the server detail message. Maybe two parameters?

Thinking about a data pipeline app, I can recall circumstances in which I've needed to log failed row inserts, but not parameter data (e.g. mixed sources where parameters are user-derived, but row values are from public third-party sources).

So separate setting levels would be nice. Maybe one enum parameter marked as a flag?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.