fix: search_path in RDS #24739
Codecov Report
@@            Coverage Diff             @@
##           master   #24739      +/-   ##
==========================================
- Coverage   68.97%   68.89%   -0.08%
==========================================
  Files        1901     1901
  Lines       74008    73935      -73
  Branches     8183     8183
==========================================
- Hits        51047    50941     -106
- Misses      20840    20873      +33
  Partials     2121     2121
... and 2 files with indirect coverage changes
type hint in get_prequeries throws an error with Python 3.9
So I changed it to
Happy to test further once these are addressed :)
@mdeshmu sorry, the initial commit was just a proof-of-concept to see how much refactoring would be necessary. I've updated this PR and tested it with Postgres and RDS. When running queries in SQL Lab, the schema was set correctly in both, based on the dropdown. Can you give it another try now? I'm going to work on adding some tests in the meantime.
I have tested this and it does fix #24684.
Right, I should've been clearer about this. This is probably because by default you have access to the whole database; in that case, setting the path is fine. The check only happens when the user doesn't have full database access, ie, when we're checking whether the user has access to the schema or the table.
True
SUMMARY
(See #24684 for more context.)
For security reasons, we need to be able to run SQL Lab queries in user-specified schemas. For example, suppose a user selects the schema foo from the dropdown in SQL Lab and runs the following query:
SELECT * FROM some_table
Since some_table is not fully qualified (it has no schema), we need to:
1. Run the query in the foo schema, so that the right table is queried.
2. Check that the user has access to foo.some_table (eg, by having access to the foo schema).
[Note that if (1) is not true, we can't do (2), since we don't know the schema of some_table.]
For Postgres, the way we currently do (1) is by setting options=-csearch_path=foo in the connection arguments every time the user runs a query. The problem is that the Postgres DB engine spec and the Postgres SQLAlchemy dialect are also used by RDS, which does not support options being passed this way. As a result, in 3.0 RC1 it's not possible to query RDS, because the search path is set for every query.
Fortunately there's a way that works for both Postgres and RDS (and hopefully, other databases using the Postgres client): by running the query set search_path=foo when the connection is first created.
This PR introduces the concept of "pre queries", ie, queries that are run as soon as a raw connection is created. This ensures that the schema (and in the future, catalog) can be set correctly per query for Postgres and potentially other databases (the implementation is per DB engine spec).
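The "pre queries" idea can be illustrated with a minimal sketch. This is not the code from the PR: the signature of get_prequeries, the connect_with_prequeries helper, and the Cursor/Connection protocols are all assumptions made for illustration. The sketch only shows the general shape of building the statements from the selected schema and running them on a freshly opened raw connection.

```python
from typing import Callable, List, Optional, Protocol


class Cursor(Protocol):
    def execute(self, sql: str) -> object: ...


class Connection(Protocol):
    def cursor(self) -> Cursor: ...


def get_prequeries(schema: Optional[str] = None) -> List[str]:
    """Return statements to run right after a raw connection is opened.

    Hypothetical signature; the real method lives on the DB engine spec.
    """
    if schema is None:
        return []
    # Quote the identifier so unusual schema names cannot alter the statement.
    quoted = '"' + schema.replace('"', '""') + '"'
    return [f"set search_path = {quoted}"]


def connect_with_prequeries(
    connect: Callable[[], Connection],
    schema: Optional[str] = None,
) -> Connection:
    """Open a connection, then run the pre queries before handing it back."""
    conn = connect()
    cursor = conn.cursor()
    for sql in get_prequeries(schema):
        cursor.execute(sql)
    return conn
```

Because the search path is set with an ordinary statement after connecting, rather than through the options connection argument, the same approach should apply to any server that accepts set search_path.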
Note that if DML is enabled it's still possible for users to run set search_path=bar in SQL Lab. This could lead to security issues where:
1. The user selects foo as the schema in SQL Lab.
2. The user runs set search_path=bar; SELECT * FROM some_table.
3. Superset checks access to some_table. Since the table name is not fully qualified and the dropdown is set to foo, Superset will check for access to foo.some_table, even though the table is actually bar.some_table.
This would allow a malicious user who has access to a given schema (say, foo) to query tables in any schema by simply choosing foo from the schema dropdown in SQL Lab and specifying the search path in the query.
To fix this problem (at least for Postgres), I extended get_default_schema_for_query to check for set search_path and raise an exception if it's there.
Ideally, instead of raising an exception we would parse the query to figure out the schema, but this is not trivial. Because of the complexity of figuring out the correct schema for each table in a query with multiple statements, I opted to raise an exception for now, hoping that in the future we can improve this.
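A rough sketch of the kind of check described above follows. It is not the implementation in get_default_schema_for_query: the SearchPathNotAllowed exception, the check_no_search_path function, the regular expression, and the error message are all hypothetical. It is deliberately conservative and has known gaps, noted in the comments.

```python
import re

# Matches "SET search_path = ..." and "SET search_path TO ...", optionally
# with SESSION or LOCAL, anywhere in the submitted SQL (case-insensitive).
_SET_SEARCH_PATH = re.compile(
    r"\bset\s+(?:(?:session|local)\s+)?search_path\s*(?:=|\bto\b)",
    re.IGNORECASE,
)


class SearchPathNotAllowed(Exception):
    """Raised when a query tries to change the Postgres search path."""


def check_no_search_path(sql: str) -> None:
    """Reject SQL that appears to set the search path.

    Limitations of this sketch: the pattern is matched against the raw text,
    so it also fires on matches inside string literals or comments (false
    positives), and it does not catch other ways of changing the setting,
    such as set_config() or a comment placed between SET and search_path.
    """
    if _SET_SEARCH_PATH.search(sql):
        raise SearchPathNotAllowed(
            "Setting the search path in a query is not allowed."
        )
```

With this sketch, check_no_search_path("set search_path=bar; SELECT * FROM some_table") raises, while check_no_search_path("SELECT * FROM some_table") returns without error.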
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Added new unit tests.
ADDITIONAL INFORMATION